
United States Patent and Trademark Office 



UNITED STATES DEPARTMENT OF COMMERCE 
United States Patent and Trademark Office 
Address: COMMISSIONER FOR PATENTS 
P.O. Box 1450 

Alexandria, Virginia 22313-1450 
www.uspto.gov 



APPLICATION NO.: 10/038,977
FILING DATE: 12/31/2001
FIRST NAMED INVENTOR: Douglas Neal Fuller
ATTORNEY DOCKET NO.: DF01-001
CONFIRMATION NO.: 9586



7590 03/15/2005 

Dr. Douglas Neal Fuller 
P.O. Box 450936 
Atlanta, GA 31145-0936 



EXAMINER: ZHU, JERRY
ART UNIT: 2121
PAPER NUMBER:

DATE MAILED: 03/15/2005 



Please find below and/or attached an Office communication concerning this application or proceeding. 

RECEIVED
MAR 25 2005
Technology Center 2100



PTO-90C (Rev. 10/03) 



Office Action Summary

Application No.: 10/038,977
Applicant(s): FULLER, DOUGLAS NEAL
Examiner: Jerry Zhu
Art Unit: 2121





- The MAILING DATE of this communication appears on the cover sheet with the correspondence address - 



Period for Reply 

A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) FROM 
THE MAILING DATE OF THIS COMMUNICATION. 

- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed
after SIX (6) MONTHS from the mailing date of this communication. 

- If the period for reply specified above is less than thirty (30) days, a reply within the statutory minimum of thirty (30) days will be considered timely. 

- If NO period for reply is specified above, the maximum statutory period will apply and will expire SIX (6) MONTHS from the mailing date of this communication. 

- Failure to reply within the set or extended period for reply will, by statute, cause the application to become ABANDONED (35 U.S.C. § 133).
Any reply received by the Office later than three months after the mailing date of this communication, even if timely filed, may reduce any
earned patent term adjustment. See 37 CFR 1.704(b). 

Status

1) [ ] Responsive to communication(s) filed on ____.

2a) [ ] This action is FINAL. 2b) [ ] This action is non-final.

3) [ ] Since this application is in condition for allowance except for formal matters, prosecution as to the merits is
closed in accordance with the practice under Ex parte Quayle, 1935 C.D. 11, 453 O.G. 213.

Disposition of Claims 

4) [X] Claim(s) 1-21 is/are pending in the application.

4a) Of the above claim(s) ____ is/are withdrawn from consideration.

5) [ ] Claim(s) ____ is/are allowed.

6) [X] Claim(s) 1-21 is/are rejected.

7) [ ] Claim(s) ____ is/are objected to.

8) [ ] Claim(s) ____ are subject to restriction and/or election requirement.

Application Papers 

9) [ ] The specification is objected to by the Examiner.

10) [ ] The drawing(s) filed on ____ is/are: a) [ ] accepted or b) [ ] objected to by the Examiner.

Applicant may not request that any objection to the drawing(s) be held in abeyance. See 37 CFR 1.85(a).
Replacement drawing sheet(s) including the correction is required if the drawing(s) is objected to. See 37 CFR 1.121(d).

11) [ ] The oath or declaration is objected to by the Examiner. Note the attached Office Action or form PTO-152.

Priority under 35 U.S.C. § 119 

12) [ ] Acknowledgment is made of a claim for foreign priority under 35 U.S.C. § 119(a)-(d) or (f).
a) [ ] All b) [ ] Some * c) [ ] None of:

1. [ ] Certified copies of the priority documents have been received.

2. [ ] Certified copies of the priority documents have been received in Application No. ____.

3. [ ] Copies of the certified copies of the priority documents have been received in this National Stage
application from the International Bureau (PCT Rule 17.2(a)).
* See the attached detailed Office action for a list of the certified copies not received.



Attachment(s)

1) [X] Notice of References Cited (PTO-892)

2) [ ] Notice of Draftsperson's Patent Drawing Review (PTO-948)

3) [X] Information Disclosure Statement(s) (PTO-1449 or PTO/SB/08)
Paper No(s)/Mail Date ____.

4) [ ] Interview Summary (PTO-413)
Paper No(s)/Mail Date ____.

5) [ ] Notice of Informal Patent Application (PTO-152)

6) [ ] Other: ____.



U.S. Patent and Trademark Office
PTOL-326 (Rev. 1-04)    Office Action Summary    Part of Paper No./Mail Date 20050201



Application/Control Number: 10/038,977 Page 2 

Art Unit: 2121 

DETAILED ACTION 



Claim Rejections - 35 USC § 101 



35 U.S.C. 101 reads as follows: 

Whoever invents or discovers any new and useful process, machine, manufacture, or 
composition of matter, or any new and useful improvement thereof, may obtain a patent 
therefor, subject to the conditions and requirements of this title. 

The invention as disclosed in claims 1-21 is directed to non-statutory subject
matter.



1. Claims 1-21 are method claims whose steps are not practiced on a computer,
electronic device, electrical machine, mechanical apparatus, or any other
concrete and tangible instrument or equipment. These steps are just
abstract procedures manipulating abstract concepts. Therefore, it is clear
that these claims are not limited to practice in the technological arts. On that
basis alone, they are clearly nonstatutory.

2. Regardless of whether any of the claims are in the technological arts, claims
1-21 are just manipulating abstract ideas. Congress intended statutory
subject matter to "include anything under the sun that is made by man."
Diamond v. Diehr, 450 U.S. at 182, 209 USPQ at 6. "This Court has
undoubtedly recognized limits to §101 and every discovery is not embraced
within the statutory terms. Excluded from such patent protection are laws of
nature, physical phenomena and abstract ideas." Id. at 185, 209 USPQ at 7.
A claim that covers any and every possible way that the steps can be
performed is a disembodied "abstract idea" because there is no particular
implementation of the idea. See Gottschalk v. Benson, 409 U.S. at 68, 175
USPQ at 675 (The Supreme Court discussed the cases holding that a
principle, in the abstract, cannot be patented and then stated: "Here the
'process' claim is so abstract and sweeping as to cover both known and
unknown uses of the BCD to pure binary conversion. The end use may ... be
performed through any existing machinery or future-devised machinery or
without any apparatus.").

Furthermore, in the case In re Warmerdam, the Federal Circuit held that: 



... [T]he dispositive issue for assessing compliance with Section 101
in this case is whether the claim is for a process that goes beyond
simply manipulating 'abstract ideas' or 'natural phenomena' ... As
the Supreme Court has made clear, '[a]n idea of itself is not
patentable,' ... taking several abstract ideas and manipulating them
together adds nothing to the basic equation. In re Warmerdam, 31
USPQ2d at 1759 (emphasis added).



Since the Federal Circuit held in Warmerdam that this is the "dispositive
issue" when it judged the usefulness, concreteness, and tangibility of the
claim limitations in that case, Examiner in the present case views this holding
as the dispositive issue for determining whether a claim is "useful, concrete,
and tangible" in similar cases. Accordingly, the Examiner finds that the
method claims manipulate a set of abstract ideas such as "population,"
"members," "rules," and "behavior" (i.e., what population is it? A population of
marbles, animals, vehicles, people who have pets?). Clearly, manipulation of
abstract ideas such as parameters and characteristics of an abstract population is
provably even more abstract (and thereby less limited in practical application)
than pure "mathematical algorithms," which the Supreme Court has held are
per se nonstatutory - in fact, it includes the expression of nonstatutory
mathematical algorithms. Since the claims are not limited to exclude such
abstractions, the broadest reasonable interpretation of the claim limitations
includes such abstractions. Therefore, the claims are impermissibly abstract
under 35 U.S.C. §101.

3. Regardless of whether any of the claims are abstract or not, none of them is
limited to practical applications in the technological arts. There is no physical
transformation either inside or outside of a computer as the result of
performing the method. Examiner finds that In re Warmerdam, 33 F.3d 1354,
31 USPQ2d 1754 (Fed. Cir. 1994) controls the 35 USC §101 issues on that
point for reasons made clear by the Federal Circuit in AT&T Corp. v. Excel
Communications, Inc., 50 USPQ2d 1447 (Fed. Cir. 1999). Specifically, the
Federal Circuit held that the act of:



...[T]aking several abstract ideas and manipulating them together
adds nothing to the basic equation. AT&T v. Excel at 1453 quoting
In re Warmerdam, 33 F.3d 1354, 1360 (Fed. Cir. 1994).

Examiner finds no evidence in the claims that manipulating "sub-population" 
using "rules" produces any concrete, tangible, practical, chemical, physical, or 
business transformation. 

Examiner bases his position upon guidance provided by the Federal Circuit in 
In re Warmerdam, as interpreted by AT&T v. Excel. This set of precedents is 
within the same line of cases as the Alappat-State Street Bank decisions and 
is in complete agreement with those decisions. Warmerdam is consistent with 
State Street's holding that:



Today we hold that the transformation of data, representing discrete 
dollar amounts, by a machine through a series of mathematical 
calculations into a final share price, constitutes a practical 
application of a mathematical algorithm, formula, or calculation 
because it produces "a useful, concrete and tangible result" - a final
share price momentarily fixed for recording purposes and even
accepted and relied upon by regulatory authorities and in
subsequent trades. (emphasis added) State Street Bank at 1601.




That case later eliminated the "business method exception" in order to 
show that business methods were not per se nonstatutory, but the court 
clearly did not go so far as to make business methods per se statutory. A 
plain reading of the excerpt above shows that the Court was very specific 
in its definition of the new practical application. It would have been much 
easier for the court to say that "business methods were per se statutory" 
than it was to define the practical application in the case as "...the 
transformation of data, representing discrete dollar amounts, by a machine 
through a series of mathematical calculations into a final share price..." 

Additionally, the court was also careful to specify that the "useful, concrete 
and tangible result" it found was "a final share price momentarily fixed for 
recording purposes and even accepted and relied upon by regulatory 
authorities and in subsequent trades." (i.e., the trading activity is the further
practical use of the real world monetary data beyond the transformation in 
the computer - i.e., "post-processing activity".) 

Applicant cites no such specific results to define a useful, concrete and 
tangible result. Neither does Applicant specify the associated practical 
application with the kind of specificity the Federal Circuit used. 




Assuming that the claims fall within the category of a "process" under 
§101, the steps are so broadly recited, without regard to any tangible way 
of implementing them, that they are directed to the "abstract idea" itself 
and the claims are nonstatutory subject matter under the "abstract idea" 
exception. The abstract ideas comprising the steps are not instantiated 
into some specific physical implementation. Nor are there any minor 
physical acts, such as recording, that might be construed as an 
implementation of the abstract idea. 

Where a claim is broad enough to read on both statutory subject matter 
(machine implementation or physical transformation of physical subject 
matter) as well as nonstatutory subject matter (an abstract idea), the best 
position is to hold the claimed subject matter to be nonstatutory because, 
while a claim is pending and can be amended, a claim's meaning should
be delimited by express terms rather than claim interpretation. Cf. In re
Lintner, 458 F.2d 1013, 1015, 173 USPQ 560, 562 (CCPA 1972)
("Claims which are broad enough to read on obvious subject matter are 
unpatentable even though they also read on non-obvious subject 
matter."). 



Conclusion 




Any inquiry concerning this communication or earlier communications from
the examiner should be directed to Jerry Zhu whose telephone number is (571)
272-4237. The examiner can normally be reached on 8:30 - 5.

If attempts to reach the examiner by telephone are unsuccessful, the
examiner's supervisor, Anthony Knight, can be reached on (571) 272-3687. The
fax phone number for the organization where this application or proceeding is
assigned is 703-872-9306.

Information regarding the status of an application may be obtained from 
the Patent Application Information Retrieval (PAIR) system. Status information 
for published applications may be obtained from either Private PAIR or Public 
PAIR. Status information for unpublished applications is available through 
Private PAIR only. For more information about the PAIR system, see
http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

Jerry Zhu 
Examiner 
Art Unit 2121

2/1/2005

Anthony Knight
Supervisory Patent Examiner
Tech Center 2100



PTO/SB/08A (10-01) 
Approved for use through 10/31/2002. OMB 0651-0031
U.S. Patent and Trademark Office: U.S. DEPARTMENT OF COMMERCE
Under the Paperwork Reduction Act of 1995, no persons are required to respond to a collection of information unless it contains a valid OMB
control number.



Substitute for form 1449A/PTO

INFORMATION DISCLOSURE
STATEMENT BY APPLICANT

(use as many sheets as necessary)

Sheet 1 of

Complete if Known

Application Number:
Filing Date:
First Named Inventor: FULLER
Art Unit:
Examiner Name:
Attorney Docket Number: DF01-001



U.S. PATENT DOCUMENTS

Columns: Examiner Initials* | Cite No.1 | Document Number (Number - Kind Code2, if known) | Publication Date (MM-DD-YYYY) | Name of Patentee or Applicant of Cited Document | Pages, Columns, Lines, Where Relevant Passages or Relevant Figures Appear

Document Number | Publication Date | Name of Patentee or Applicant
US-6,272,483 | 8/7/2001 | Joslin, Clements
US-6259,339 | 7/31/2001 |
US-6,223,164 | 5/24/2001 | Seare et al.
US-6,088,510 | 7/11/2000 | Sima
US-6,02[illegible] | [illegible] | Lynch et al.
[illegible] | [illegible] | [illegible]
US-6,272,478 | 8/7/2001 | [illegible] et al.
US-6,212,526 | 4/3/2001 |
US-6,125,362 | 9/26/2000 | Swarthy
US-5,787,420 | 7/29/2001 |
US-5,712,984 | 1/27/1998 | Hammond et al.
US-6,029,138 | 2/22/2000 | Khorasami et al.
US-6,202,053 | 3/13/2001 | Christiansen et al.
US-6,266,656 | 7/24/2001 | [illegible]
US-6,182,070 | 1/30/2001 | Megiddo et al.
US-5,187,673 | 2/16/1993 | [illegible]
US-5,790,758 | 8/4/1998 | Streit

FOREIGN PATENT DOCUMENTS

Columns: Examiner Initials* | Cite No.1 | Foreign Patent Document (Country Code3 - Number4 - Kind Code5) | Publication Date (MM-DD-YYYY) | Name of Patentee or Applicant of Cited Document | Pages, Columns, Lines, Where Relevant Passages or Relevant Figures Appear | T6

Examiner Signature: [illegible]
Date Considered: [illegible]



*EXAMINER: Initial if reference considered, whether or not citation is in conformance with MPEP 609. Draw line through citation if not in
conformance and not considered. Include copy of this form with next communication to applicant.

1 Applicant's unique citation designation number (optional). 2 See Kinds Codes of USPTO Patent Documents at www.uspto.gov or MPEP
901.04. 3 Enter Office that issued the document, by the two-letter code (WIPO Standard ST.3). 4 For Japanese patent documents, the
indication of the year of the reign of the Emperor must precede the serial number of the patent document. 5 Kind of document by the
appropriate symbols as indicated on the document under WIPO Standard ST.16 if possible. 6 Applicant is to place a check mark here if
English language Translation is attached.

Burden Hour Statement: This form is estimated to take 2.0 hours to complete. Time will vary depending upon the needs of the individual case.
Any comments on the amount of time you are required to complete this form should be sent to the Chief Information Officer, U.S. Patent and
Trademark Office, Washington, DC 20231. DO NOT SEND FEES OR COMPLETED FORMS TO THIS ADDRESS. SEND TO: Assistant
Commissioner for Patents, Washington, DC 20231.



PTO/SB/08B (10-01)
Approved for use through 10/31/2002. OMB 0651-0031
U.S. Patent and Trademark Office: U.S. DEPARTMENT OF COMMERCE
Under the Paperwork Reduction Act of 1995, no persons are required to respond to a collection of information unless it contains a valid OMB
control number. 



Substitute for form 1449B/PTO

INFORMATION DISCLOSURE
STATEMENT BY APPLICANT

(use as many sheets as necessary)

Sheet    of

Complete if Known

Application Number:
Filing Date:
First Named Inventor: FULLER
Group Art Unit:
Examiner Name:
Attorney Docket Number: DF01-001



OTHER PRIOR ART - NON PATENT LITERATURE DOCUMENTS

Columns: Examiner Initials* | Cite No.1 | Include name of the author (in CAPITAL LETTERS), title of the article (when appropriate), title of the item (book, magazine, journal, serial, symposium, catalog, etc.), date, page(s), volume-issue | T2

1. Anderberg, Michael, Cluster Analysis for Applications, Academic Press, New York, 1973.

2. Bock, H.H., Classification and Related Methods of Data Analysis, Elsevier Science
Publishers, North Holland, 1988.

3. Celeux, Gilles and Soromenho, Gilda, "An Entropy Criterion for Assessing the Number of
Clusters in a Mixture Model," Journal of Classification, 13, 195-212, 1996.

4. Fowles JB et al, "Taking health status into account when setting capitation rates: a
comparison of risk-adjustment methods," JAMA 276(16): 1316-21, 1996.

5. Hornbrook, Mark C & Goodman, Michael J., "Assessing Relative Health Plan Risk with the
Rand 36 Health Survey," Inquiry, 32:1, 56-74, 1995.

6. Hornbrook, Mark C & Goodman, Michael J., "Chronic Disease, Functional Health Status,
and Demographics: A Multi-Dimensional Approach to Risk Adjustment," Health Services
Research, 31:3, 283-307, 1996.

7. Lamers, LM & van Vliet RC, "Multiyear diagnostic information from prior hospitalization as
a risk-adjuster for capitation payments," Medical Care, 34:4, 549-561, 1996.

8. Lorr, Maurice, Cluster Analysis for Social Scientists, Jossey-Bass Publishers, San
Francisco, 1983.

9. Matthews, Geoffrey & Hearne, James, "Clustering Without a Metric," IEEE Transactions
on Pattern Analysis and Machine Intelligence, V13N2, p175-184, 1991.

10. Soromenho, Gilda, "Comparing Approaches for Testing the Number of Components in a
Finite Mixture Model," Computational Statistics, 9, 65-78, 1994.

11. Nord, Erik, "Health Status Index Models for use in Resource Allocation Decisions,"
International Journal of Technology Assessment in Health Care, 12:1, 1996.





Examiner Signature: [illegible]
Date Considered: [illegible]

*EXAMINER: Initial if reference considered, whether or not citation is in conformance with MPEP 609. Draw line through citation if not in conformance
and not considered. Include copy of this form with next communication to applicant.
1 Applicant's unique citation designation number (optional). 2 Applicant is to place a check mark here if English language Translation is attached.
Burden Hour Statement: This form is estimated to take 2.0 hours to complete. Time will vary depending upon the needs of the individual case. Any
comments on the amount of time you are required to complete this form should be sent to the Chief Information Officer, U.S. Patent and Trademark
Office, Washington, DC 20231. DO NOT SEND FEES OR COMPLETED FORMS TO THIS ADDRESS. SEND TO: Assistant Commissioner for Patents,
Washington, DC 20231.



PTO/SB/08B (10-01) 
Approved for use through 10/31/2002. OMB 0651-0031
U.S. Patent and Trademark Office: U.S. DEPARTMENT OF COMMERCE
Under the Paperwork Reduction Act of 1995, no persons are required to respond to a collection of information unless it contains a valid OMB
control number. 



Substitute for form 1449B/PTO

INFORMATION DISCLOSURE
STATEMENT BY APPLICANT

(use as many sheets as necessary)

Sheet    of

Complete if Known

Application Number:
Filing Date:
First Named Inventor: FULLER
Group Art Unit:
Examiner Name:
Attorney Docket Number: DF01-001




OTHER PRIOR ART - NON PATENT LITERATURE DOCUMENTS

Columns: Examiner Initials* | Cite No.1 | Include name of the author (in CAPITAL LETTERS), title of the article (when appropriate), title of the item (book, magazine, journal, serial, symposium, catalog, etc.), date, page(s), volume-issue number(s), publisher, city and/or country where published

12. Anderson, Richard V., "Can Risk-Assessment Tools be Feasibly Used in the Health
Benefit Marketplace?", Advances in Health Economics and Health Services Research,
12, JAI Press, 1991.

13. Goodman, Michael J. et al, "Persistence of Health Care Expense in an Insured Working
Population," Advances in Health Economics and Health Services Research, 12, JAI
Press, 1991.

14. Fuller, Douglas N., "AN EXPLORATION OF POPULATION CLASSIFICATION FOR
MANAGED HEALTHCARE WITHIN A STATE-BASED MODELING FRAMEWORK,"
University of Virginia Dissertation, dated Jan. 2000, published May 22, 2000 (see
attached letter from Director of Cataloging Services, University of Virginia Library).

Examiner Signature: [illegible]
Date Considered: [illegible]





*EXAMINER: Initial if reference considered, whether or not citation is in conformance with MPEP 609. Draw line through citation if not in conformance
and not considered. Include copy of this form with next communication to applicant.

1 Applicant's unique citation designation number (optional). 2 Applicant is to place a check mark here if English language Translation is attached.

Burden Hour Statement: This form is estimated to take 2.0 hours to complete. Time will vary depending upon the needs of the individual case. Any
comments on the amount of time you are required to complete this form should be sent to the Chief Information Officer, U.S. Patent and Trademark
Office, Washington, DC 20231. DO NOT SEND FEES OR COMPLETED FORMS TO THIS ADDRESS. SEND TO: Assistant Commissioner for Patents,
Washington, DC 20231.



Notice of References Cited

Application/Control No.: 10/038,977
Applicant(s)/Patent Under Reexamination: FULLER, DOUGLAS NEAL
Examiner: Jerry Zhu
Art Unit: 2121
Page 1 of 1



U.S. PATENT DOCUMENTS

Columns: * | Document Number (Country Code-Number-Kind Code) | Date (MM-YYYY) | Name | Classification

FOREIGN PATENT DOCUMENTS

Columns: * | Document Number (Country Code-Number-Kind Code) | Date (MM-YYYY) | Country | Name | Classification

NON-PATENT DOCUMENTS

Columns: * | Include as applicable: Author, Title, Date, Publisher, Edition or Volume, Pertinent Pages

U | Ferry Butar Butar, "Empirical Bayes Methods in Survey Sampling," August, 1997,


*A copy of this reference is not being furnished with this Office action. (See MPEP § 707.05(a).)
Dates in MM-YYYY format are publication dates. Classifications may be US or foreign. 

U.S. Patent and Trademark Office 



PTO-892 (Rev. 01-2001) Notice of References Cited Part of Paper No. 20050201 



EMPIRICAL BAYES METHODS IN SURVEY SAMPLING 



by 

Ferry Butar Butar 
A DISSERTATION 



Presented to the Faculty of 
The Graduate College at the University of Nebraska 
In Partial Fulfillment of Requirements 
For the Degree of Doctor of Philosophy 



Major: Mathematics and Statistics 

Under the Supervision of Professor Parthasarathi Lahiri 

Lincoln, Nebraska 
August, 1997 



INFORMATION TO USERS 



This manuscript has been reproduced from the microfilm master. UMI 
films the text directly from the original or copy submitted. Thus, some 
thesis and dissertation copies are in typewriter face, while others may be 
from any type of computer printer. 

The quality of this reproduction is dependent upon the quality of the 
copy submitted. Broken or indistinct print, colored or poor quality
illustrations and photographs, print bleedthrough, substandard margins, 
and improper alignment can adversely affect reproduction. 

In the unlikely event that the author did not send UMI a complete 
manuscript and there are missing pages, these will be noted. Also, if 
unauthorized copyright material had to be removed, a note will indicate 
the deletion. 

Oversize materials (e.g., maps, drawings, charts) are reproduced by 
sectioning the original, beginning at the upper left-hand corner and 
continuing from left to right in equal sections with small overlaps. Each 
original is also photographed in one exposure and is included in reduced 
form at the back of the book. 

Photographs included in the original manuscript have been reproduced 
xerographically in this copy. Higher quality 6" x 9" black and white 
photographic prints are available for any photographs or illustrations
appearing in this copy for an additional charge. Contact UMI directly to 
order. 

UMI 

A Bell & Howell Information Company 
300 North Zeeb Road, Ann Arbor MI 48106-1346 USA 
313/761-4700 800/521-0600



UMI Number: 9736923 



UMI Microform 9736923 
Copyright 1997, by UMI Company. All rights reserved. 

This microform edition is protected against unauthorized 
copying under Title 17, United States Code. 



UMI 

300 North Zeeb Road 
Ann Arbor, MI 48103 






DISSERTATION TITLE
Empirical Bayes Methods in Survey Sampling

BY

Ferry Butar Butar

SUPERVISORY COMMITTEE:

APPROVED (Signature / Typed Name)

Parthasarathi Lahiri
K.M. Lal Saxena
Allan L. McCutcheon
Colin M. Ramsay

DATE [illegible]

GRADUATE COLLEGE
UNIVERSITY OF NEBRASKA



EMPIRICAL BAYES METHODS IN SURVEY SAMPLING 



Ferry Butar Butar, Ph.D. 
University of Nebraska, 1997 

Advisor: Parthasarathi Lahiri

This dissertation concerns two problems in survey sampling: (a) small-area
estimation and (b) estimation in finite population sampling. Both topics have
received considerable attention in recent years.

The empirical Bayes method has been found to be very useful in small area
estimation and finite population sampling. The method is very effective in combining
relevant information from sample surveys, various administrative records and
census data.

The first half of the dissertation is devoted to small area estimation. In large
scale national sample surveys, the sampling designs are determined so as to obtain
reliable estimates of various characteristics of interest at the national level. Due to
the availability of relatively small samples, the regular design-based estimators
perform poorly at the subnational level (e.g., state, county, etc.) when compared
to the corresponding estimator at the national level. A similar situation arises when
estimates are needed for a subgroup of the population obtained by classifying
the population according to various demographic characteristics (e.g., age, race,
sex, etc.). Such problems are known in the survey sampling literature as small area
estimation problems. Reliable small area statistics are needed in regional planning
and in allocation of government resources.

The following research has been conducted on small area estimation problems:

(a) A unified model is proposed which covers various specific small area
models considered in the literature;

(b) A general measure of uncertainty of the proposed empirical Bayes
estimator is considered; and

(c) A small area estimation method under a random sampling variance model
is developed.

The later part of the dissertation concerns empirical Bayes estimation of different
stratum means and variances when samples are obtained using a stratified simple
random sampling design. The method is especially effective when moderately
large samples are available from any given stratum. There are three main features
of this research:

(a) In order to reduce the effect of overshrinking bias associated with the
usual empirical Bayes procedures, stratum specific random effects are
introduced through the sampling variances;

(b) General measures of uncertainty are proposed for the empirical Bayes
point estimators of finite population means and variances;

(c) Laplace's second order approximation is used to approximate the
one-dimensional integrals involved in the empirical Bayes point estimators
and the measures of uncertainty of the point estimators. The approximation
is especially helpful in obtaining the measures of uncertainty of
the empirical Bayes estimators, since the measures are based on Monte
Carlo methods where checking the accuracy of the numerical integration
method at each step of the replication is troublesome.






to the memory of my father 






ACKNOWLEDGMENTS 



I would like to express my sincere gratitude to Professor Parthasarathi
Lahiri for being my major advisor and originally proposing the problems contained
in the dissertation. I had his support and encouragement throughout. Without his
enormous patience, encouragement and guidance, it would not have been possible
to complete this dissertation. I consider myself lucky to have him as my dissertation
advisor.

I would like to thank Professors K.M. Lal Saxena, Allan L. McCutcheon
and Colin M. Ramsay for serving on my committee.

I would also like to acknowledge Professor Arijit Chaudhuri of the Indian 
Statistical Institute for his assistance and support. 

I share this achievement with my mother, my wife Rosmauly and my
children Artha, Belinda, Cornelius for their constant love, patience and encouragement
and dedicate it to the memory of my father.



CONTENTS

page

ABSTRACT (ii)
ACKNOWLEDGEMENTS (v)
LIST OF TABLES (viii)

CHAPTER

1. Introduction 1
1.1 Literature Review 1
1.2 The Subject of This Dissertation 8

2. Estimation in a Mixed Linear Normal Model 10
2.1 Introduction 10
2.2 Two Models 10
2.3 Empirical Bayes Point Estimation 11
2.4 A Measure of Uncertainty 14
2.5 Two Examples 22
2.6 Simulation Experiment 24
Appendix to Chapter 2: Proofs of Theorem 31

3. Empirical Bayes Estimation of Small Area Characteristics under Random Sampling Variances 37
3.1 Introduction 37
3.2 The Bayes Estimation of $b(\sigma_i^2)$ 38
3.3 The Bayes Estimation of $\theta_i$ 41
3.4 Empirical Bayes Estimation 42
3.5 Second Order Approximation 43
3.6 Numerical Example 45

4. Empirical Bayes Estimation of Finite Population Means 57
4.1 Introduction 57
4.2 The Bayes and Empirical Bayes Estimation 59
4.3 Measure of Uncertainty 60

5. Empirical Bayes Estimation of Finite Population Variances 67
5.1 Introduction 67
5.2 The Bayes and Empirical Bayes Estimation of ith Strata Variance 68
5.3 Measure of Uncertainty of ith Strata Variance Estimator 73

6. Beta-Binomial in Finite Population Sampling 87
6.1 Introduction 87
6.2 The Bayes Estimation of ith Stratum Proportion 87
6.3 Empirical Bayes Estimation of ith Stratum Proportion 89
6.4 Measure of Uncertainty of ith Stratum 89

BIBLIOGRAPHY 91






LIST OF TABLES 



page 



1. Comparison of Different Measures of Uncertainty of Empirical
Bayes Estimates for the Baseball Data 25

2. Comparison of Different Measures of Uncertainty of Empirical Bayes
Estimates of Median Incomes of Four-Person Family for 50 States and
the District of Columbia for the Year 1988 26

3. Average Frequentist's Coverage Probability and Average Length 27

4. Simulated Bayesian Coverage Probabilities for $m = 20$, $\sigma^2 = 1.0$, $\tau^2 = 1.0$ 28

5. Simulated Bayesian Coverage Probabilities for $m = 30$, $\sigma^2 = 1.0$, $\tau^2 = 1.0$ 29

6. Percent Average Relative Biases of MSE Estimators 30

7. Average MSE of MSE Estimator 30

8. Empirical Bayes Estimate $\hat\theta_i^{EB}$ using Numerical
Integration and Laplace Methods 47

9. Empirical Bayes Estimate $\hat\sigma_i^{2\,EB}$ of $\sigma_i^2$ using Numerical
Integration and Laplace Methods 48

10. The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of Different
Estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 5$ and $\xi/\tau^2 = 0.25$ 49

11. The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of Different
Estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 5$ and $\xi/\tau^2 = 0.50$ 49

12. The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of Different
Estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 5$ and $\xi/\tau^2 = 1.00$ 50

13. The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of Different
Estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 5$ and $\xi/\tau^2 = 2.00$ 50

14. The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of Different
Estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 5$ and $\xi/\tau^2 = 4.00$ 51

15. The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of Different
Estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 10$ and $\xi/\tau^2 = 0.25$ 51

16. The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of Different
Estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 10$ and $\xi/\tau^2 = 0.50$ 52

17. The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of Different
Estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 10$ and $\xi/\tau^2 = 1.00$ 52

18. The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of Different
Estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 10$ and $\xi/\tau^2 = 2.00$ 53

19. The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of Different
Estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 10$ and $\xi/\tau^2 = 4.00$ 53

20. The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of Different
Estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 100$ and $\xi/\tau^2 = 0.25$ 54

21. The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of Different
Estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 100$ and $\xi/\tau^2 = 0.50$ 54

22. The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of Different
Estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 100$ and $\xi/\tau^2 = 1.00$ 55

23. The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of Different
Estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 100$ and $\xi/\tau^2 = 2.00$ 55






24. The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of Different
Estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 100$ and $\xi/\tau^2 = 4.00$ 56

25. Empirical Bayes Estimate $\hat\gamma_i^{EB}$ of $\gamma_i$ where $\eta = 10$ and $\xi = 4$ 63

26. The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of Different
Estimates of $\gamma_i$ where $\eta = 5$ 64

27. The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of Different
Estimates of $\gamma_i$ where $\eta = 10$ 65

28. The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of Different
Estimates of $\gamma_i$ where $\eta = 100$ 66

29. Empirical Bayes Estimate $\hat\gamma_i^{EB}$ of $\gamma_i$ where $\eta = 10$ and $\xi = 4$ 83

30. The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of Different
Estimates of $\gamma_i$ where $\eta = 5$ 84

31. The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of Different
Estimates of $\gamma_i$ where $\eta = 10$ 85

32. The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of Different
Estimates of $\gamma_i$ where $\eta = 100$ 86



CHAPTER 1 



INTRODUCTION 
1.1 Literature Review 

The empirical Bayes method has received considerable attention as an area of
research in recent years due to its proven optimality properties, simplicity and
wide range of applications. Empirical Bayes estimators have been found to be very
effective in small-area estimation and finite population sampling, the topics of 
this dissertation. According to this method, the Bayes estimator of the unknown 
parameter of interest is first obtained by using a suitable Bayesian model which 
combines information from various sources. The unknown parameters of the prior 
distribution are then estimated by a classical method such as method of moments, 
method of maximum likelihood (ML), method of residual maximum likelihood 
(REML), etc. The resulting estimator is the so-called empirical Bayes estimator. 

The empirical Bayes method in the context of nonparametric estimation of a
completely unspecified prior distribution was introduced by Robbins (1955). Efron
and Morris (1973, 1975) proposed an empirical Bayes method in the parametric
setting. Morris (1983) gave an excellent account of the empirical approach and its 
applications. 

An empirical Bayes estimator is usually obtained in a closed form and thus 
is appealing to the practitioners. One main criticism against the empirical Bayes 
method is that it does not provide a measure of uncertainty, especially for complex
models, which captures all sources of variability. An estimator of the integrated
Bayes risk of the Bayes estimator can be naively taken as a measure of uncertainty 
of an empirical Bayes estimator. But this measure can severely underestimate the 
true uncertainty of the empirical Bayes estimator since it does not incorporate the 
variabilities due to the estimation of various parameters of the prior distribution. 



There have been several attempts to incorporate this extra uncertainty in the
estimation procedure. Morris (1983) proposed a measure of uncertainty for his
empirical Bayes estimator by approximating the posterior variance of a hierarchical 
Bayes solution which assumes flat priors on the hyperparameters. He, however, 
considered a simple one way balanced random effects model. Kass and Steffey 
(1989) considered Laplace method to approximate a hierarchical Bayes solution. 
Prasad and Rao (1990) obtained a measure of uncertainty of their EBLUP (same 
as empirical Bayes) by first approximating the mean squared error (MSE) by delta 
method and then estimating the approximated MSE. Their method is, however, 
restricted to the ANOVA method of variance component estimation. Recently,
Datta and Lahiri (1997) unified the Prasad-Rao theory for a general mixed linear
model and the theory is valid for many methods of variance component estimation
including ML and REML. Lahiri and Rao (1995) extended the Prasad-Rao method
to non-normal situations.

In this dissertation we shall develop empirical Bayes theory in small area esti- 
mation and finite population sampling. 

1.1.1 Small Area Estimation 

The sampling design and the sample size of most of the large scale national 
surveys are usually determined so as to produce reliable estimates of various char- 
acteristics of interest at the national level. Quite often there is a need to produce 
similar estimates at the subnational levels (e.g., states, counties, etc.). The direct 
survey method (see Cochran 1977) fails to provide reliable estimates for a
subnational region due to the small samples available from the region. A similar
situation arises when estimates are needed for domains obtained by classifying the
population according to various demographic characteristics (e.g., age, race, sex, etc.).
Such problems in survey sampling literature are known as small area estimation 
problems. Reliable small area statistics are needed in regional planning and in 
allocation of government resources. The U.S. Census Bureau, the Bureau of
Labor Statistics, Statistics Canada, the Ministry of Planning, and other federal
and local government agencies are interested in developing reliable small area
statistics. See the working paper prepared by the subcommittee on small area esti- 
mation (1993) for a number of important small area estimation problems recently 
encountered by the U.S. federal agencies. 

The history of small area statistics can be traced back to the 11th century in 
England (see Brackstone, 1987). Records of births, baptisms, marriages, deaths, etc.,
were used to produce various small area statistics. In those early days, sources of 
small area statistics were limited to various administrative records available from 
the local governments. 

In the past fifty years sample survey has become an important component in 
many countries' statistical programs. Sample surveys have been very successful in 
supplying national and regional statistical data on a regular basis. 

Due to budgetary constraints, it is not possible to collect adequate sample
sizes from the small areas. When information on one or more relevant covariates
is available, synthetic estimators, i.e., regression estimators, have been proposed
in the small area literature (see Gonzalez, 1973; Ericksen, 1974). Although the
synthetic estimators have small variances compared to the direct survey estimators, 
they tend to be biased as they do not use the information on the characteristic of 
interest directly obtainable from the sample survey. A compromise between the 
direct survey and the synthetic estimation is the method of composite estimation 
(see Holt et al., 1979). Broadly defined, a composite estimator is a weighted 
average of a direct survey estimator and a synthetic estimator. The synthetic and 
composite estimators are usually obtained by implicit or explicit models which 
borrow strength from related resources. 

Morrison (1971) described small area estimation methods which were used prior 
to 1970. Purcell and Kish (1979) reviewed demographic methods as well as
statistical methods of estimation for small domains. National Research Council (1980)
gave detailed information on, as well as an evaluation of, the Census Bureau's procedure



for making postcensal estimates of the population and per-capita income for
local areas. Zidek (1982) introduced a criterion that could be used to evaluate the
relative performances of different methods for estimating the population of a small
area, and McCullagh and Zidek (1987) elaborated that criterion. Statistics Canada
(1987) provided an overview and evaluation of the population estimation methods 
used in Canada. Schaible (1992) provided estimates on small area used in the U.S. 
Federal programs. For a review of the history of small area estimation, various 
small area estimation procedures and their applications, the reader is referred to 
Rao (1986), Chaudhuri (1992) and Ghosh and Rao (1994). 

Due to the growing demand for small area estimation, many symposia and
workshops on small area statistics were organized during the last two decades. The list
of conferences and symposia includes National Institute of Drug Abuse, Princeton
Conference (1979), International Symposium on Small Area Statistics, Ottawa (see
Platek et al. (1987) for the invited papers and Platek and Singh (1986) for the
contributed papers), International Symposium on Small Area Statistics, New Orleans,
1988, Workshop on Small-Area Estimation for Military Personnel Planning,
Washington, D.C., 1989 and International Scientific Conference on Small-Area Statistics
on Survey Design, Warsaw, Poland, 1992.

Empirical Bayes method has been extensively used in small area estimation 
and related problems. We now present a few specific applications of the empirical
Bayes method in small area estimation and related problems. 

Carter and Rolph (1974) considered estimation of the probabilities of false fire 
alarms reported from many street boxes of New York City. Using data collected 
from the street boxes during 1967-69, they provided empirical Bayes estimates 
of the probabilities of false alarms at various street boxes for the year 1970. In 
order to estimate the probability of a false alarm for a given box, they combined 
information from that box and all the boxes in the neighborhood. 

Fay and Herriot (1979) generalized the Carter-Rolph model to incorporate
information on a number of covariates and proposed empirical Bayes estimates of



per-capita incomes of small-places (population less than 1000). Their empirical 
Bayes estimator of per-capita income is a weighted average of the Current Popu- 
lation Survey estimator of the per-capita income and a regression estimator which 
utilizes tax return data for the year 1969 and the data on housing from the 1970 
census. 

Battese et al. (1988) considered the empirical best linear unbiased prediction
(same as empirical Bayes) of areas under corn and soybeans for 12 counties in
northern Iowa for the year 1978. They combined information from two sources.
The direct information came from the 1978 June Enumerative Survey. The USDA
Statistical Reporting Service field staff determined the areas under corn and
soybeans in 37 sample segments (each segment is about 250 hectares) of 12 counties by
interviewing farm operators. The second source of data was LANDSAT satellite
data. Based on LANDSAT readings obtained during August and September of
1978, Battese et al. (1988) used the USDA procedures to classify the crop for all
pixels (picture elements, each covering about .45 hectares) in the 12 counties.
A mixed linear regression model was then considered to establish a relationship 
between the survey and satellite data for the prediction purposes. 

Every ten years the U.S. Census Bureau undertakes a census to account for 
its population. Unfortunately, the census counts have been found to be imperfect 
despite the fact that the Census Bureau puts a lot of effort into making this massive
project successful. There are various reasons for this imperfection of the census
counts. Researchers found that the accuracy of the census counts depends quite
a bit on various demographic factors (e.g., age, race, sex, etc.), on whether
dwellings are occupied by owners or non-owners (renters), and on geography. Since
the census counts are used to apportion congressional seats and to allocate federal
funds in various federal programs, the differential undercount poses a serious problem.
We refer to a special issue of Survey Methodology (see Vol. 18, 1992) which contains
papers of Cressie and Datta et al. discussing various issues and methods to adjust the
census counts. 



The Department of Health and Human Services (HHS) uses estimates of the
median income of four-person families at the state level to formulate its energy 
assistance program for low income families. Such data are provided by the U.S. 
Census Bureau for all the states and the District of Columbia on an annual basis. 
Currently the Census Bureau uses an empirical Bayes estimator based on the work 
by Fay (1987) (see also Fay et al. 1993; Datta et al. 1991; Datta et al. 1996; Ghosh
et al. 1996). These papers used data from the annual demographic supplement to the
March sample of the Current Population Survey (CPS) which provides annual 
median income by states and family sizes, the decennial censuses and the Bureau 
of Economic Analysis which provides annual estimates of per-capita income for the 
states. 

1.1.2 Finite Population Sampling 

The results from the empirical Bayes estimation for small area characteristics 
can be adapted in estimating finite population means and variances from stratified 
simple random sampling. For a traditional design based approach to the finite 
population sampling, the reader is referred to Cochran (1977). Ericson (1969a)
put forward an elegant formulation of the subjective Bayes approach to the finite 
population sampling. In his approach, he first assumed that the finite popula- 
tion is a realization from a hypothetical population which is the usual assumption 
in the super-population approach in finite population theory (see Royall 1970). 
At the second stage, Ericson (1969a) assumed a subjective prior distribution on 
the parameters of the super-population model. In practice, it is generally
difficult to apply Ericson's Bayesian method since the prior parameters are hardly
known. Ghosh and Meeden (1986) considered an empirical Bayes approach under
stratified simple random sampling, using a one-way random effects model. They
successfully demonstrated that their method can be very effective in repeated
surveys and small-area estimation. Their empirical Bayes estimator is asymptotically
optimal in the sense of Robbins (1955). Later on Ghosh and Lahiri (1987) relaxed






the normality assumption of Ghosh and Meeden (1986) and showed that the
Ghosh-Meeden estimator is robust under the assumption of posterior linearity (see Ericson
1969b; Goldstein 1975; Hartigan 1969). The Ghosh-Meeden empirical Bayes
estimator can also be motivated from the best linear prediction approach of Prasad
and Rao (1990). Nandram and Sedransk (1993) extended the Ghosh-Meeden
estimator under different but random sampling variances. Recently, Arora et al.
(1997) considered an alternative to the Nandram-Sedransk method. Their method
can incorporate relevant auxiliary information which may be available from
various administrative records and censuses. They also proposed, for the first time, a
measure of uncertainty of the empirical Bayes estimator of finite population means
which can incorporate uncertainty due to estimation of all the parameters in the
Bayesian model. Their method is an extension of the parametric bootstrap method
proposed earlier by Laird and Louis (1987) to the finite population sampling.

For the last fifteen years, there has been a growing demand from both the public 
and private sectors to produce reliable statistics for various subgroups of a finite 
population. According to Brackstone (1987) "there is in Canada, and probably in
other countries too, an increasing government concern with issues of distribution,
equity and disparity." Consider the problem of comparing the income distribution
for various geographical areas of a country. Is it enough to consider just the
per-capita income? Probably not, since two geographical areas may be comparable
in terms of their per-capita incomes, yet they may vary considerably in terms 
of diversity which can be measured by the variances of their income distributions. 
Although the problem of finite population variances for different geographic groups 
is a very important problem, it has received relatively less attention than the 
problem of estimation of means, ratios and proportions for different geographical 
areas in the finite population sampling. 

Ericson (1969a) briefly addressed the problem of the Bayesian estimation of 
a finite population variance under simple random sampling. Datta and Ghosh 
(1993) provided a unified approach to the Bayesian estimation of different strata 



variances in finite population sampling under stratified random sampling. Ghosh 
and Lahiri (1987) considered the problem using a linear empirical Bayes approach. 
Lahiri and Tiwari (1990) proposed nonparametric empirical Bayes estimation
using the Dirichlet process prior (Ferguson 1973). 

Note that the model considered by Datta and Ghosh (1993) does not
incorporate stratum specific random effects through the scale components. Although this
synthetic assumption may have an insignificant effect on the estimation of different
stratum means, it may cause undue shrinkage in the Bayes estimator of different
stratum variances. Ghosh and Lahiri (1987) and Lahiri and Tiwari (1990)
introduced random stratum effects through the scale parameters, but even then they
failed to overcome the overshrinkage problem, primarily because of the linear nature
of their Bayes estimators. However, we realize that the linear empirical Bayes
procedure of Ghosh and Lahiri (1987) and the nonparametric empirical Bayes
approach of Lahiri and Tiwari (1990) are very robust and it is difficult to resolve the
problem associated with overshrinking without being specific about the
distribution of the stratum specific random scale effects.

1.2 The Subject of This Dissertation 

The organization of this dissertation is as follows: 
In Chapter 2, we present a unified approach to empirical Bayes estimation in 
small area estimation and related problems. A simple measure of uncertainty 
which incorporates all sources of variations is proposed and hence the chapter 
addresses an outstanding problem in empirical Bayes estimation. Using a simple 
small area model, we show that our proposed measure of uncertainty enjoys both 
the frequentist and Bayesian properties. The method is also validated using real 
life examples and Monte Carlo simulations. 

Small area estimation under random sampling variances has recently received 
a lot of attention. In Chapter 3, we consider the estimation of both small area






means and variances under a random sampling variances model. We use Laplace's 
second order approximation to obtain closed form formulae for the Bayes and 
empirical Bayes estimators. We demonstrate the accuracy of the approximations 
using numerical examples. In order to compare the performances of the proposed 
empirical Bayes estimators, we carry out Monte Carlo simulations. 

We address empirical Bayes estimation in finite population sampling in Chap- 
ters 4, 5 and 6. In Chapters 4 and 5, we obtain useful approximations to the 
estimators earlier proposed by Arora (1994). In these chapters, we also consider 
the important problem of measuring the uncertainty of the proposed empirical
Bayes estimators in finite population sampling. In Chapter 6, we consider
empirical Bayes estimation of finite population proportions when samples are drawn
using a stratified simple random sampling design. 






CHAPTER 2 
Estimation in a Mixed Linear Normal Model 
2.1 Introduction 

The main objective of this chapter is to develop a simple measure of uncer- 
tainty of an empirical Bayes small area estimator under a fairly general longitudi- 
nal mixed linear model. The proposed model covers many important small area 
models considered earlier in the literature including the Fay-Herriot and the nested 
error regression models described in section 2.2. In section 2.3, we introduce the 
general model and consider empirical Bayes point estimation of a general mixed 
effect. The empirical Bayes point estimator is identical with the empirical best 
linear unbiased predictor (EBLUP) proposed by Prasad and Rao (1990). Despite 
the popularity of the empirical Bayes method, the literature on empirical Bayes 
measure of uncertainty is not very rich as explained in Chapter 1. In section 
2.4, we propose a unified measure of uncertainty for the proposed empirical Bayes 
estimator. Two real life examples are considered in section 2.5 to illustrate our
method. Finally, in section 2.6, a Monte Carlo simulation is performed to validate
the proposed measure of uncertainty of the empirical Bayes estimator. Proofs of 
all the theorems are given in the appendix to this chapter. 

2.2 Two Models 

2.2.1 Fay-Herriot Model 

Fay and Herriot (1979) considered an empirical Bayes method to estimate
per-capita incomes of small places (population less than 1000). In order to obtain
their empirical Bayes estimates, they used the following model:

(i) $Y_i \mid \theta_i \overset{ind}{\sim} N(\theta_i, D_i)$, $i = 1, \ldots, m$;

(ii) A priori, $\theta_i \overset{ind}{\sim} N(x_i'\beta, A)$, $i = 1, \ldots, m$,

where $Y_i$ is the survey estimator of per-capita income for the $i$th area, $D_i$ is the
known sampling variance of $Y_i$, and $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})'$ is a vector of known
benchmark variables obtainable from the 1969 tax return data and 1970 census data.

2.2.2 Nested-Error Regression Model 

Battese et al. (1988) proposed a nested-error regression model to predict areas
under corn and soybeans for 12 counties in Northern Iowa. They assumed that

$$y_{ij} = \beta_0 + \beta_1 x_{1ij} + \beta_2 x_{2ij} + u_{ij},$$

where $i$ is a subscript for the county ($i = 1, \ldots, 12$) and $j$ is a subscript for a
segment within a given county ($j = 1, \ldots, n_i$, where $n_i$ is the number of segments
in the $i$th county). Here $y_{ij}$ is the number of hectares under corn (soybeans) in the
$j$th segment of the $i$th county as reported in the June Enumerative Survey; $x_{1ij}$ is
the number of pixels of corn and $x_{2ij}$ is the number of pixels of soybeans for the
$j$th segment in the $i$th county, and $\beta_0, \beta_1, \beta_2$ are unknown parameters. The
covariates $x_{1ij}$ and $x_{2ij}$ were obtained for the sample segments as well as for
non-sample segments using Landsat satellite data. The random error associated with
the reported crop area is expressed as $u_{ij} = v_i + e_{ij}$, where $v_i$ is a random effect
due to the $i$th county and $e_{ij}$ is a pure error for the $j$th segment in the $i$th county.
They assumed $v_i \sim N(0, \sigma_v^2)$, $e_{ij} \sim N(0, \sigma_e^2)$ ($j = 1, \ldots, n_i$; $i = 1, \ldots, 12$),
$\mathrm{Cov}(v_i, e_{ij}) = 0$, $\mathrm{Cov}(v_i, v_{i'}) = 0$ if $i \neq i'$, and
$\mathrm{Cov}(e_{ij}, e_{i'j'}) = 0$ if $(i, j) \neq (i', j')$.

The two models discussed above are special cases of the general mixed linear
model described in section 2.3. 
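
As a rough illustration of the nested-error regression model of section 2.2.2, the following Python sketch simulates data of the form $y_{ij} = \beta_0 + \beta_1 x_{1ij} + \beta_2 x_{2ij} + v_i + e_{ij}$ and builds the covariance matrix of the responses within one county. All numerical values (the allocation of segments to counties, the coefficients, the variance components and the covariate ranges) and the function names are hypothetical choices made for illustration; they are not taken from Battese et al. (1988). In the notation of the general model, the sketch assumes $Z_i = 1_{n_i}$, $U_i = v_i$, $G_i = \sigma_v^2$ and $R_i = \sigma_e^2 I_{n_i}$, so that the covariance of the $i$th block is $\sigma_e^2 I_{n_i} + \sigma_v^2 1_{n_i}1_{n_i}'$.

```python
import numpy as np


def simulate_nested_error(n_per_area, beta, sigma_v, sigma_e, rng):
    """Simulate y_ij = beta0 + beta1*x1ij + beta2*x2ij + v_i + e_ij."""
    n_per_area = np.asarray(n_per_area)
    m = n_per_area.size
    area = np.repeat(np.arange(m), n_per_area)  # county index of each segment
    n = area.size
    # Hypothetical covariates standing in for the pixel counts x1ij and x2ij.
    X = np.column_stack([
        np.ones(n),
        rng.uniform(100.0, 400.0, size=n),
        rng.uniform(50.0, 300.0, size=n),
    ])
    v = rng.normal(0.0, sigma_v, size=m)  # county effects v_i
    e = rng.normal(0.0, sigma_e, size=n)  # segment-level errors e_ij
    y = X @ beta + v[area] + e
    return area, X, y


def area_covariance(n_i, sigma_v, sigma_e):
    """Covariance matrix of the n_i responses in one county."""
    return sigma_e**2 * np.eye(n_i) + sigma_v**2 * np.ones((n_i, n_i))


rng = np.random.default_rng(0)
n_per_area = [1, 1, 1, 2, 3, 3, 3, 3, 4, 5, 5, 6]  # illustrative: 37 segments, 12 counties
beta = np.array([10.0, 0.3, -0.1])                  # hypothetical coefficients
area, X, y = simulate_nested_error(n_per_area, beta, sigma_v=12.0, sigma_e=15.0, rng=rng)
V_last = area_covariance(n_per_area[-1], sigma_v=12.0, sigma_e=15.0)
print(y.shape, V_last.shape)
```

Segments in the same county share the effect $v_i$, which is why the off-diagonal entries of the block covariance equal $\sigma_v^2$ while segments in different counties are uncorrelated.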

2.3 Empirical Bayes Point Estimation 

Let $X_i$ and $Z_i$ be $n_i \times p$ and $n_i \times k_i$ matrices of constants. Let
$n = \sum_{i=1}^m n_i$ and $k = \sum_{i=1}^m k_i$. Consider the following Bayesian model:

Model 1:

(i) Conditional on a $k_i \times 1$ random vector $U_i$, the $Y_i$'s are independent with
$Y_i \mid U_i \sim N_{n_i}(X_i\beta + Z_iU_i, R_i)$, $i = 1, \ldots, m$;

(ii) A priori, $U_i \overset{ind}{\sim} N_{k_i}(0, G_i)$, $i = 1, \ldots, m$,

where $R_i = R_i(\psi)$ and $G_i = G_i(\psi)$ are respectively $n_i \times n_i$ and $k_i \times k_i$
matrices which possibly depend on $\psi$, an $s \times 1$ vector of variance components.

Consider the estimation of $\theta_i = l_i'\beta + \lambda_i'U_i$, where $l_i$ and $\lambda_i$ are $p \times 1$ and
$k_i \times 1$ vectors of known constants respectively.

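From Model 1, we first need to find the distribution of $U_i \mid Y_i$. The
joint density $f(Y, U)$ satisfies

$$f(Y, U) \propto \exp\left[-\frac{1}{2}\sum_{i=1}^m \left\{(Y_i - X_i\beta - Z_iU_i)'R_i^{-1}(Y_i - X_i\beta - Z_iU_i) + U_i'G_i^{-1}U_i\right\}\right]. \quad (1)$$

To find the distribution of $U_i \mid Y_i$, look at the terms in the exponent of (1) involving $U_i$:

$$\begin{aligned}
&\sum_i \left\{(Y_i - X_i\beta - Z_iU_i)'R_i^{-1}(Y_i - X_i\beta - Z_iU_i) + U_i'G_i^{-1}U_i\right\} \\
&\quad= \sum_i (Y_i - X_i\beta)'R_i^{-1}(Y_i - X_i\beta) - 2\sum_i (Y_i - X_i\beta)'R_i^{-1}Z_iU_i
 + \sum_i U_i'Z_i'R_i^{-1}Z_iU_i + \sum_i U_i'G_i^{-1}U_i \\
&\quad= \sum_i (Y_i - X_i\beta)'R_i^{-1}(Y_i - X_i\beta) - 2\sum_i (Y_i - X_i\beta)'R_i^{-1}Z_iU_i
 + \sum_i U_i'(G_i^{-1} + Z_i'R_i^{-1}Z_i)U_i \\
&\quad= \sum_i (Y_i - X_i\beta)'R_i^{-1}(Y_i - X_i\beta) \\
&\qquad+ \sum_i \left\{U_i - (G_i^{-1} + Z_i'R_i^{-1}Z_i)^{-1}Z_i'R_i^{-1}(Y_i - X_i\beta)\right\}'(G_i^{-1} + Z_i'R_i^{-1}Z_i)
 \left\{U_i - (G_i^{-1} + Z_i'R_i^{-1}Z_i)^{-1}Z_i'R_i^{-1}(Y_i - X_i\beta)\right\} \\
&\qquad- \sum_i (Y_i - X_i\beta)'R_i^{-1}Z_i(G_i^{-1} + Z_i'R_i^{-1}Z_i)^{-1}Z_i'R_i^{-1}(Y_i - X_i\beta). \quad (2)
\end{aligned}$$

Hence, the distribution of $U_i \mid Y_i$ is normal, i.e.,

$$U_i \mid Y_i, \beta, \psi \sim N\left[(G_i^{-1} + Z_i'R_i^{-1}Z_i)^{-1}Z_i'R_i^{-1}(Y_i - X_i\beta),\ (G_i^{-1} + Z_i'R_i^{-1}Z_i)^{-1}\right].$$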

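Claim that $(G_i^{-1} + Z_i'R_i^{-1}Z_i)^{-1}Z_i'R_i^{-1} = G_iZ_i'V_i^{-1}$.
Proof:

$$\begin{aligned}
V_i^{-1} &= (R_i + Z_iG_iZ_i')^{-1} \\
&= R_i^{-1} - R_i^{-1}Z_i(Z_i'R_i^{-1}Z_i + G_i^{-1})^{-1}Z_i'R_i^{-1} \\
&= R_i^{-1} - R_i^{-1}Z_i\left[(Z_i'R_i^{-1}Z_iG_i + I_{k_i})G_i^{-1}\right]^{-1}Z_i'R_i^{-1} \\
&= R_i^{-1} - R_i^{-1}Z_iG_i(Z_i'R_i^{-1}Z_iG_i + I_{k_i})^{-1}Z_i'R_i^{-1} \\
&= R_i^{-1} - R_i^{-1}Z_iG_i\Sigma Z_i'R_i^{-1},
\end{aligned}$$

where $\Sigma = (Z_i'R_i^{-1}Z_iG_i + I_{k_i})^{-1}$, $\Sigma^{-1} = Z_i'R_i^{-1}Z_iG_i + I_{k_i}$, thus $Z_i'R_i^{-1}Z_iG_i =
\Sigma^{-1} - I_{k_i}$ ($I_{k_i}$ being the identity matrix of order $k_i \times k_i$). Now,

$$\begin{aligned}
Z_i'V_i^{-1} &= Z_i'R_i^{-1} - Z_i'R_i^{-1}Z_iG_i\Sigma Z_i'R_i^{-1} \\
&= Z_i'R_i^{-1} - (\Sigma^{-1} - I_{k_i})\Sigma Z_i'R_i^{-1} \\
&= Z_i'R_i^{-1} - Z_i'R_i^{-1} + \Sigma Z_i'R_i^{-1} \\
&= (Z_i'R_i^{-1}Z_iG_i + I_{k_i})^{-1}Z_i'R_i^{-1},
\end{aligned}$$

then

$$\begin{aligned}
G_iZ_i'V_i^{-1} &= G_i(Z_i'R_i^{-1}Z_iG_i + I_{k_i})^{-1}Z_i'R_i^{-1} \\
&= \left[(Z_i'R_i^{-1}Z_iG_i + I_{k_i})G_i^{-1}\right]^{-1}Z_i'R_i^{-1} \\
&= (Z_i'R_i^{-1}Z_i + G_i^{-1})^{-1}Z_i'R_i^{-1}.
\end{aligned}$$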
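
The claimed identity can also be checked numerically. The sketch below draws hypothetical positive definite matrices $R_i$ and $G_i$ and an arbitrary $Z_i$ (the dimensions, the seed and the helper `random_spd` are illustrative assumptions), then compares the two sides of the identity. It also compares $(G_i^{-1} + Z_i'R_i^{-1}Z_i)^{-1}$ with $G_i - G_iZ_i'V_i^{-1}Z_iG_i$, an equivalent form of the posterior covariance that follows from the same matrix inversion argument. A numerical check of this kind does not replace the proof; it only guards against algebra slips.

```python
import numpy as np


def random_spd(d, rng):
    """Return a random symmetric positive definite d x d matrix."""
    a = rng.normal(size=(d, d))
    return a @ a.T + d * np.eye(d)


rng = np.random.default_rng(1)
n_i, k_i = 5, 2                      # hypothetical dimensions
R = random_spd(n_i, rng)             # plays the role of R_i
G = random_spd(k_i, rng)             # plays the role of G_i
Z = rng.normal(size=(n_i, k_i))      # plays the role of Z_i
V = R + Z @ G @ Z.T                  # V_i = R_i + Z_i G_i Z_i'

R_inv = np.linalg.inv(R)
precision = np.linalg.inv(G) + Z.T @ R_inv @ Z

# Left and right sides of the claim.
lhs = np.linalg.solve(precision, Z.T @ R_inv)
rhs = G @ Z.T @ np.linalg.inv(V)

# Two forms of the posterior covariance of U_i given Y_i.
post_cov = np.linalg.inv(precision)
post_cov_alt = G - G @ Z.T @ np.linalg.solve(V, Z @ G)

print(np.max(np.abs(lhs - rhs)), np.max(np.abs(post_cov - post_cov_alt)))
```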

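Then, the probability distribution of $U_i \mid Y_i$ can be written as

$$U_i \mid Y_i, \beta, \psi \sim N\left[G_iZ_i'V_i^{-1}(Y_i - X_i\beta),\ (G_i^{-1} + Z_i'R_i^{-1}Z_i)^{-1}\right].$$
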

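Under the above model and squared error loss function, the Bayes estimator of
$\theta_i$ is given by

$$\hat\theta_i^{B} = E[\theta_i \mid Y_i] = l_i'\beta + \lambda_i'G_i(\psi)Z_i'V_i^{-1}(\psi)(Y_i - X_i\beta) = \theta_i(Y_i; \beta, \psi), \text{ say}, \quad (3)$$

where $V_i(\psi) = R_i + Z_iG_iZ_i'$ ($i = 1, \ldots, m$).

When $\psi$ is known but $\beta$ is unknown, $\beta$ is estimated by the maximum likelihood
estimator $\hat\beta(\psi)$, where

$$\hat\beta(\psi) = \left[\sum_{j=1}^m X_j'V_j^{-1}(\psi)X_j\right]^{-1}\left[\sum_{j=1}^m X_j'V_j^{-1}(\psi)Y_j\right].$$

Plugging in $\hat\beta(\psi)$ for $\beta$ in $\hat\theta_i^{B}$, we get the following empirical Bayes estimator of
$\theta_i$:

$$\hat\theta_i^{EB} = \theta_i(Y_i; \hat\beta(\psi), \psi) = l_i'\hat\beta(\psi) + \lambda_i'G_i(\psi)Z_i'V_i^{-1}(\psi)[Y_i - X_i\hat\beta(\psi)]. \quad (4)$$

Note that $\hat\theta_i^{EB}$ is a robust estimator in the sense that it can be viewed as a best
linear unbiased predictor (see Prasad and Rao 1990) or a linear empirical Bayes
estimator (Ghosh and Lahiri 1987).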
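
The estimator in (4) can be sketched in a few lines of Python for the case where the variance components are treated as known. The function names `gls_beta` and `eb_estimate`, the list-of-blocks data layout and the simulated Fay-Herriot-type data are all assumptions introduced for illustration; the sketch only mirrors the formulas for $\hat\beta(\psi)$ and (4) and is not meant as a definitive implementation.

```python
import numpy as np


def gls_beta(Y, X, V):
    """beta_hat(psi) = [sum_j X_j' V_j^{-1} X_j]^{-1} [sum_j X_j' V_j^{-1} Y_j]."""
    lhs = sum(Xj.T @ np.linalg.solve(Vj, Xj) for Xj, Vj in zip(X, V))
    rhs = sum(Xj.T @ np.linalg.solve(Vj, Yj) for Yj, Xj, Vj in zip(Y, X, V))
    return np.linalg.solve(lhs, rhs)


def eb_estimate(i, Y, X, Z, G, R, l_i, lam_i):
    """Estimator (4) of theta_i = l_i' beta + lam_i' U_i for known G_j and R_j."""
    V = [Rj + Zj @ Gj @ Zj.T for Zj, Gj, Rj in zip(Z, G, R)]  # V_j = R_j + Z_j G_j Z_j'
    beta_hat = gls_beta(Y, X, V)
    resid = Y[i] - X[i] @ beta_hat
    u_hat = G[i] @ Z[i].T @ np.linalg.solve(V[i], resid)      # G_i Z_i' V_i^{-1} (Y_i - X_i beta_hat)
    return l_i @ beta_hat + lam_i @ u_hat


# Hypothetical Fay-Herriot-type data: n_i = k_i = 1, Z_i = 1, G_i = A, R_i = D_i.
rng = np.random.default_rng(2)
m, A = 8, 2.0
D = rng.uniform(0.5, 3.0, size=m)
x = np.column_stack([np.ones(m), rng.normal(size=m)])
y = x @ np.array([1.0, 0.5]) + rng.normal(0.0, np.sqrt(A + D))

Y = [y[j:j + 1] for j in range(m)]
X = [x[j:j + 1, :] for j in range(m)]
Z = [np.ones((1, 1)) for _ in range(m)]
G = [np.array([[A]]) for _ in range(m)]
R = [np.array([[D[j]]]) for j in range(m)]

theta_0 = eb_estimate(0, Y, X, Z, G, R, l_i=x[0], lam_i=np.ones(1))
print(theta_0)
```

In this scalar special case the result reduces to $x_i'\hat\beta + \{A/(A + D_i)\}(Y_i - x_i'\hat\beta)$, which gives a convenient check of the general code.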

In practice $\beta$ and $\psi$ are both unknown. In this case, an empirical Bayes
estimator of $\theta_i$ is obtained as $\hat\theta_i^{EB} = \theta_i(Y_i; \hat\beta(\hat\psi), \hat\psi)$, where $\hat\psi$ is an estimator of
$\psi$ which satisfies the regularity conditions of Datta and Lahiri (1997) given in the
Appendix. We shall assume that $E(\hat\psi - \psi) = -m^{-1}B(\psi) + o(m^{-1})$, where the
functional form of $B(\psi)$ is known.

Example: Model 1 covers the Fay-Herriot example in section 2.2.1.
Fay-Herriot Model (see Fay and Herriot 1979)

(i) $Y_i \mid \theta_i \overset{ind}{\sim} N(x_i'\beta + \theta_i, D_i)$, $i = 1, \ldots, m$;

(ii) A priori, $\theta_i \overset{ind}{\sim} N(0, A)$, $i = 1, \ldots, m$;

where the $D_i$'s are known and the $x_i$'s are $p \times 1$ vectors of known constants. In the
notation of Model 1, $n_i = k_i = 1$, $Z_i = 1$, $U_i = \theta_i$, $\psi = A$, $G_i = A$, $R_i = D_i$ and
$X_i = x_i'$.

In this case, we may consider the following empirical Bayes estimator of $\theta_i$:

$$\hat\theta_i^{EB} = x_i'\hat\beta + (1 - \hat B_i)(Y_i - x_i'\hat\beta),$$

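where $\hat B_i = D_i/(\hat A + D_i)$, $\hat\beta = \hat\beta(\hat A)$ and

$$\hat A = \max\left(0,\ (m - p)^{-1}\left[\sum_{i=1}^m (Y_i - x_i'\hat\beta_{OLS})^2 - \sum_{i=1}^m (1 - h_{ii})D_i\right]\right),$$

with $\hat\beta_{OLS}$ the ordinary least squares estimator of $\beta$ and $h_{ii} = x_i'(\sum_{j=1}^m x_jx_j')^{-1}x_i$. In
this case, $B(\psi) = 0$.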
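
A minimal sketch of the Fay-Herriot empirical Bayes estimator with an estimated $A$ is given below. It assumes a moment estimator of $A$ of the Prasad-Rao type, based on ordinary least squares residuals and leverages $h_{ii}$ and truncated at zero; the function name `fay_herriot_eb` and the simulated inputs are hypothetical and serve only to show the shrinkage structure of the estimator.

```python
import numpy as np


def fay_herriot_eb(y, X, D):
    """Empirical Bayes estimates under the Fay-Herriot model with a moment estimate of A."""
    m, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_ols = XtX_inv @ X.T @ y
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)   # leverages h_ii = x_i'(X'X)^{-1}x_i
    rss = np.sum((y - X @ beta_ols) ** 2)
    # Assumed moment estimator of A, truncated at zero.
    A_hat = max(0.0, (rss - np.sum((1.0 - h) * D)) / (m - p))
    w = 1.0 / (A_hat + D)
    # Weighted least squares estimate of beta evaluated at A_hat.
    beta_hat = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * y))
    B_hat = D / (A_hat + D)                        # shrinkage factors
    synthetic = X @ beta_hat
    theta_eb = synthetic + (1.0 - B_hat) * (y - synthetic)
    return theta_eb, beta_hat, A_hat, B_hat


# Hypothetical data generated from the model with A = 1.5.
rng = np.random.default_rng(3)
m = 30
D = rng.uniform(0.5, 2.0, size=m)
X = np.column_stack([np.ones(m), rng.normal(size=m)])
y = X @ np.array([2.0, 1.0]) + rng.normal(0.0, np.sqrt(1.5 + D))

theta_eb, beta_hat, A_hat, B_hat = fay_herriot_eb(y, X, D)
print(A_hat, theta_eb[:3])
```

Each estimate is a weighted average of the direct estimate $Y_i$ and the regression estimate $x_i'\hat\beta$, with more weight on the regression estimate when the sampling variance $D_i$ is large relative to $\hat A$.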

For other particular cases of Model 1, see Carter and Rolph (1974), Fay and 
Herriot (1979), Battese et al. (1988), Lahiri and Wang (1992), among others.

~ EB 

2.4. A Measure of Uncertainty of 0, 

Define the integrated Bayes risk of Of 8 as r{0f B ) = E{9f 8 - 0,) 2 , where the 
expectation is taken with respect to Model I. Note that r(0f B ) is identical to the 



lo 



MSE of Of* as defined in Prasad and Rao ( 1990). A measure of uncertainty of 0. 
is given by 

r(0f B ) = E(6f B -0;) 2 

= E{i\m + x'A - i',3 - mf - 

= E{l\Cm - 0) + A{(& - Ui)} 2 

= Par ) - /?)] + Var - U { )] 

+ 2l' ; Cou{{l' i (0(4>) - 3),\\{Ui - Ui)} 
= f t \jbXiYr l i+) Xi)- l li + KVarmXi 

1 = 1 

+ y i Var[U i )\ i - 2X' i Cov{U i ,U[)K 

+ 2l\Cov(.l0l)\i - U'iCoo&UftXi. (5) 

Note that Ui = G.-(V') The third term of (5) is A<<7ff»A,-. Now. the 

second term of (5) is 

Kar(£/,) = Var {Gi^Z'^mY |- A'.-^VO)} 

= Far { G' ; (0)2;v;.-(0){y; - a,[£; .vjvr'W Wl^E )*51) } 

1=1 1=1 J 

= Var{G,(0)Z;K- l (0)>V} 

m m 

1=1 1=1 

( mm 

YfViWZiGiW] 
= GiWWVr^mZiGiW) 

+ ^ar{c i #)2r?Vr , («' 1 )A'.-E XjVr'WW'tW W. + ••• 
I 1=1 

+ A';v- , (0)>; 1 )} 

- 2 Cov { C-^^VrV W-}-Yi(f; A'/r l (0)A.]- 1 






i=i 
1=1 

= GiU>)ziv- i w){i ni - Xii£x!vr i (*)Xir l xivr i w}ZiGt(*) m 

1=1 

The fourth term of (5) becomes 

Cao{Ui,U[) = Cou - Xi3(tl})), 

= Cot, {GiC^K-^^tA'^ + W + c,- 

i=l 1=1 

= Cov {GiU>)Z' ! Vr l (il>)[X i l} + Z.tf, + ti 

-Xifi - x£xivrH*)XA- l (£x;vr l [*)ZiUi) 

i=i i=i 
i=i i=i J J 

I 1=1 

= Cew iGMWVrWVn, - Xilf^X'iVr^Xi}- 1 

X' { V- l (+)]ZiGiU>) (7) 



Since in the above e(s and C/,'s are independent and U-s are also independent. The 
fifth term of (5) is 

Cov&U!) = Cov {faGdWrH*M-Xi0(+))Y} 
= Cov {pXV-\t)ZiGM)) 

- cov {0j(4>)XiVr l (mG t (ii>))} 

m 

= £ A'/K.-'fcJA',]-' .^-'(tf )ZiCi(0) simi/nr ro f/ie third term 
i=i 



-[E^K" , (^)A',r l A';i/- , (t.)Z.G I (0) 
1=1 

= 0 (S) 
The sixth term of (5) is 

f m m \ 

I 1=1 i=l J 

= cat, (e AJIT '(W L .^w + + f.i £•;} 

V i=l i=I J 

= E^^i^-'^wiZiCw (9) 

Now, using (6) — (9), (5) becomes 

r(*f*) = JJE*? tf'W A',]-'/.- + AjOiWAi 

1=1 

m 

+KG i {ti>)z\yr x {*){in. - .v,E .W l WWWW}^i(^)A,- 

i=l 

w 

-2 AJGiW^IT^^i/-. - JfiEWWWWlWlAi 



1=1 



1=1 

m 

{A, - XdY.XlV-HnWW- 1 (>!>)} ZiGMX; 



1=1 

m 



i=l 



t=l 



+A;a(^)z;v;- , (v/oA^EA';vr , (vox]- , A';K- 1 (0)^G i (^)A i 

i=l 

+ /;[f;A';i/- i (v)A ; j- , / < - 

i=i 






+ (i i -x;v-\*)zmmY(f:XiV- l (ri>)x i )- 1 

i=l 

(li-XlVr l fflZi)Gi(*)\i) 
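$= g_{1i}(\psi) + g_{2i}(\psi)$, say, (10)

where $g_{1i}(\psi) = \lambda_i' G_i(\psi) [I_{k_i} - Z_i' V_i^{-1}(\psi) Z_i G_i(\psi)] \lambda_i$ ($I_k$ being the identity
matrix of order $k \times k$) and $g_{2i}(\psi) = (l_i - X_i' V_i^{-1}(\psi) Z_i G_i(\psi) \lambda_i)' (\sum_{j=1}^{m} X_j' V_j^{-1}(\psi) X_j)^{-1}
(l_i - X_i' V_i^{-1}(\psi) Z_i G_i(\psi) \lambda_i)$ (see Kackar and Harville 1984; Prasad and Rao
1990). An appropriate estimator of $r(\hat{\theta}_i^{EB})$ can be taken as a measure of
uncertainty of $\hat{\theta}_i^{EB}$. First note that

$r(\hat{\theta}_i^{EB}) = g_{1i}(\psi) + g_{2i}(\psi) + E[\hat{\theta}_i^{EB} - \hat{\theta}_i^{B}]^2$. (11)

Note that $g_{1i}(\psi)$ is the measure of uncertainty of the Bayes estimator of $\theta_i$, $g_{2i}(\psi)$ is
the uncertainty due to estimation of $\beta$, and the third term is due to the estimation
of $\psi$. One may naively approximate $r(\hat{\theta}_i^{EB})$ by $g_{1i}(\psi) + g_{2i}(\psi)$, which ignores the
uncertainty due to estimation of $\psi$. Datta and Lahiri (1997) showed that under
regularity conditions (RC),

$E(\hat{\theta}_i^{EB} - \hat{\theta}_i^{B})^2 = g_{3i}(\psi) + o(m^{-1})$, (12)

where $g_{3i}(\psi) = \mathrm{trace}[L_i(\psi) V_i(\psi) L_i'(\psi) \Sigma(\psi)]$, $L_i(\psi) = \mathrm{col}_{1 \le j \le s} L_{ij}(\psi)$,
$L_{ij}(\psi) = \frac{\partial}{\partial \psi_j} (\lambda_i' G_i(\psi) Z_i' V_i^{-1}(\psi))$ and $\Sigma(\psi) = E(\hat{\psi} - \psi)(\hat{\psi} - \psi)'$. The expressions for
$\Sigma(\psi)$ for some standard methods of estimation of $\psi$ (e.g., ANOVA, ML, REML)
are given in Prasad and Rao (1990) and Datta and Lahiri (1997). Thus, the naive
approximation, i.e., $g_{1i}(\psi) + g_{2i}(\psi)$, could lead to a serious underestimation since
$g_{3i}(\psi)$ is of order $O(m^{-1})$, the same as the order of $g_{2i}(\psi)$. For this reason, in this
paper we shall not ignore any term of order $O(m^{-1})$. A naive estimator of $r(\hat{\theta}_i^{EB})$
is obtained from the naive approximation and is given by $V_i^{N} = g_{1i}(\hat{\psi}) + g_{2i}(\hat{\psi})$.
It follows from Datta and Lahiri (1997) that under regularity conditions (RC),

$E\{g_{1i}(\hat{\psi}) + g_{2i}(\hat{\psi})\} = g_{1i}(\psi) + g_{2i}(\psi) - m^{-1} B'(\psi) \nabla g_{1i}(\psi) - g_{3i}(\psi) + o(m^{-1})$, (13)

where $\nabla g_{1i}(\psi) = \mathrm{col}_{1 \le j \le s} \frac{\partial}{\partial \psi_j} g_{1i}(\psi)$, and estimated $g_{1i}(\psi) + g_{2i}(\psi)$ in (11) by
$g_{1i}(\hat{\psi}) + g_{2i}(\hat{\psi}) + m^{-1} B'(\hat{\psi}) \nabla g_{1i}(\hat{\psi}) + g_{3i}(\hat{\psi})$, where $B(\hat{\psi})$, $\nabla g_{1i}(\hat{\psi})$ and $g_{3i}(\hat{\psi})$
are obtained from $B(\psi)$, $\nabla g_{1i}(\psi)$ and $g_{3i}(\psi)$, respectively, when $\psi$ is replaced by $\hat{\psi}$.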

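To estimate $r(\hat{\theta}_i^{EB})$ in (11), we shall estimate $g_{1i}(\psi) + g_{2i}(\psi)$ by the Datta-
Lahiri or Prasad-Rao method. However, our approach differs from the Prasad-Rao
or Datta-Lahiri method in the estimation of the third term of (11). To introduce
the method, consider the following bootstrap model:

Model 2:

(i) $Y_i^* \mid U_i^* \stackrel{ind}{\sim} N_{n_i}(X_i \hat{\beta} + Z_i U_i^*, \hat{R}_i)$, $i = 1, \ldots, m$;

(ii) A priori, $U_i^* \stackrel{ind}{\sim} N_{k_i}(0, \hat{G}_i)$, $i = 1, \ldots, m$;
where $\hat{\beta} = \hat{\beta}(\hat{\psi})$, $\hat{R}_i = R_i(\hat{\psi})$ and $\hat{G}_i = G_i(\hat{\psi})$.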

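A reasonable estimator of the third term of (11) is
$E_*[\hat{\theta}_i(Y_i; \hat{\beta}^*(\hat{\psi}^*), \hat{\psi}^*) - \hat{\theta}_i(Y_i; \hat{\beta}^*(\hat{\psi}), \hat{\psi})]^2$, where $E_*$ is the expectation with
respect to Model 2, $\hat{\beta}^*(\psi) = (\sum_{i=1}^{m} X_i' V_i^{-1}(\psi) X_i)^{-1} (\sum_{i=1}^{m} X_i' V_i^{-1}(\psi) Y_i^*)$, and the
calculation of $\hat{\psi}^*$ is the same as that of $\hat{\psi}$ except that it is based on the $Y_i^*$'s
instead of the $Y_i$'s. It is shown in Theorem A.1 that

$E_*[\hat{\theta}_i(Y_i; \hat{\beta}^*(\hat{\psi}^*), \hat{\psi}^*) - \hat{\theta}_i(Y_i; \hat{\beta}^*(\hat{\psi}), \hat{\psi})]^2 = g_{4i}(\hat{\psi}; Y_i) + o_p(m^{-1})$, (14)

where $g_{4i}(\hat{\psi}; Y_i) = \mathrm{trace}[L_i(\hat{\psi}) (Y_i - X_i \hat{\beta}(\hat{\psi}))(Y_i - X_i \hat{\beta}(\hat{\psi}))' L_i'(\hat{\psi}) \Sigma(\hat{\psi})]$.
Thus, we now propose the following estimator of $r(\hat{\theta}_i^{EB})$:

$V_i^{P} = g_{1i}(\hat{\psi}) + m^{-1} B'(\hat{\psi}) \nabla g_{1i}(\hat{\psi}) + g_{2i}(\hat{\psi}) + g_{3i}(\hat{\psi}) + g_{4i}(\hat{\psi}; Y_i)$. (15)

Laird and Louis (1987) proposed a measure of uncertainty of an empirical Bayes
estimator for a very special case of Model 1, specifically the Fay-Herriot model with
$x_i'\beta = \mu$ and $D_i = D$ ($i = 1, \ldots, m$). For the general model, one may extend their
measure as $V_i^{LL} = E_*[g_{1i}(\hat{\psi}^*)] + V_*[\hat{\theta}_i(Y_i; \hat{\beta}^*(\hat{\psi}^*), \hat{\psi}^*)]$, where $E_*$ and $V_*$ are the
expectation and the variance with respect to Model 2.

In Theorem A.2, we show that under Model 1 and regularity conditions (RC),

$E[V_i^{LL}] = r(\hat{\theta}_i^{EB}) - 2m^{-1} B'(\psi) \nabla g_{1i}(\psi) - g_{3i}(\psi) + o(m^{-1})$, (16)

but

$E[V_i^{P}] = r(\hat{\theta}_i^{EB}) + o(m^{-1})$. (17)

Thus, unlike $V_i^{P}$, $V_i^{LL}$ could lead to an underestimation of $r(\hat{\theta}_i^{EB})$ since the order
of bias for $V_i^{LL}$ is $O(m^{-1})$.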

Remark 1. Consider a special case of the Fay-Herriot model when x\J = /* 
and D{ = D (i = L - ? m). An empirical Bayes estimator of 0, is given by 
§i BB = Y + (1 - ft) (tf - F). where F = tf, ft = nun(1.0 s fl) and 

i? = Z^p - ( see Prasad and Rao 1990: Laird and Louis 1987). Using a flat 

improper prior distribution on $\mu$ and $B = D/(A + D)$, Morris (1983a, 1983b) suggested
an approximation of the posterior variance of $\theta_i$ as a measure of uncertainty of $\hat{\theta}_i^{EB}$.
See Kass and Steffey (1989) for an alternative Bayesian approach using the Laplace
approximation. The measure is given by

V? = (I-5,)Z? + + -^(« - F) 2 . (IS) 

m »? — 3 

To get the proposed measure of uncertainty, we shall solve r(0f B ) i.e.. 

r(0f fl ) = £.(*f s - 0,-] 2 = E[0f B - OA 2 + E.[0f B - 0f B } 2 (ID) 

The first term of (19) is 

E[0f B - 0,f = E[Y + (l-B l )(Y i -7)-0 i ] 7 

= E {(F-/t) + (l - S,)[(V; - /z) - (F-/i)] - (0 t - - ,,)}* 
= E {B,(F - /i) + ( 1 - B x )( V; - ,1) - (0, - ^)} 2 
= 0?£(F- /<) 2 + (1 - £?,)'£(* - /x) 2 + (0, - p) 2 

+2S,(l-5,)£:(F~/i)(VJ-/i) . 

-2 B x E(7-n)(9i - fl ) - 2(1 - £,)£(?; - /<)(0, - /i) 

-2fl,£{(0; -/*) | 0i) 

-2{l -By)E{(O i - l i)E(Y i -n)\0 t ) 

BiD A 2 2 AD A 

= - J — + TV"?? + A + r— -pr — '1B\ — — 2(1 - B t )A 

m A+D mA+D m v ' 






B X D A 7 , IAD 2AD 2 A* 

+ . , ~ + A + 



m, A + D m(A + D) m(A + D) A + D 



m A + D 

= + — (20) 

Note that the last term of (20) are estimated by 0(1 - £,) + + ^7^. The 
second term of (19) is 

Eitff* - Of 8 ] 2 = E.[Y - ( 1 - B\) [Yi - F) - Y - (1 - A ) (Yi - F)J 2 
= E4-{B; - Bx) (Yi - F)] 2 
= - F) 2 E,[Bl - B x ) 7 
= (Yi - F) 2 E W (B\) 7 + Bf - B*B X 

= (Yi - F) 2 £.{(m - DB^y + Bl- lBx E.{(m - l)B^} 

= (K - F) 2 J?., - (21) 

where $u$ in the above is chi-square with $m - 1$ degrees of freedom. Using (20) and (21),
$r(\hat{\theta}_i^{EB})$, or the proposed measure of uncertainty, is

Vr = (1 - A,)/) + H^L + SLll *L + liL ( y: _ 7)2 (22) 

m m ~ I »i m - 5 

Similarly, the Prasad-Rao (1990) measure of uncertainty is given by

K P * = (1 -0,)0 + ^A + i^A, (23) 

rn m 

The measure of uncertainty of $\hat{\theta}_i^{EB}$ using the method of Kass and Steffey (1989) is
given by:

Vi KSi = (1 - B X )D + + ^V(V; - F) 2 , (24) 

m m — I 

One can also adjust the Kass-Steffey measure of uncertainty for order $O(m^{-1})$ bias
and get



yKSt! _ yKSi + 



(25) 

m 



Laird and Louis (1987) proposed the following measure of uncertainty of $\hat{\theta}_i^{EB}$:



m — o tti m — o 



(26) 



Thus, and V-^ are identical upto order o p (m~ l ). However, the difference 
between Vf* and V ; LL is of order 0 p (m" 1 ). The Prasad- Rao measure V] pn cannot 
match a hierarchical Bayes solution since a hierarchical Bayes solution must be of 



the form (1 - E(B | Y)) D + D W' + (Yi - Y) 2 V(B | K), where E(B | K) 



and $V(B \mid Y)$ are the posterior mean and variance of $B$, under a suitable prior on
the hyper-parameters $\mu$ and $B$. We emphasize that the Prasad-Rao method gives
exactly the same measure of uncertainty for all the small areas, unlike the other
methods, since the Prasad-Rao measure does not depend on the individual $Y_i$.
Remark 2. It is shown in Theorem A.1 that $V_i^{LL} = g_{1i}(\hat{\psi}) + g_{2i}(\hat{\psi}) + g_{4i}(\hat{\psi}; Y_i)
- m^{-1} B'(\hat{\psi}) \nabla g_{1i}(\hat{\psi}) - g_{3i}(\hat{\psi}) + o_p(m^{-1})$. Since $E\, g_{4i}(\hat{\psi}; Y_i) = g_{3i}(\psi) + o(m^{-1})$,
it is quite possible that for some $i$, $V_i^{LL}$ could give us a measure which is less than
the naive measure $V_i^{N}$ (at least for large $m$).

For the general case of the Fay-Herriot model, the proposed measure of uncertainty
of $\hat{\theta}_i^{EB}$ is given by




Note that in this case, the Laird-Louis method can be approximated by 




(1 - B t )Di + Bf x'i(E D^BiXiX^Xi 




(28) 
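The two terms that remain legible in (28), $(1 - B_i) D_i$ and $B_i^2 x_i' (\sum_j D_j^{-1} B_j x_j x_j')^{-1} x_i$, are what $g_{1i}$ and $g_{2i}$ reduce to under the Fay-Herriot model, with $B_i = D_i/(A + D_i)$. The sketch below evaluates their sum for a single scalar covariate; plugging in an estimate of A gives the naive measure. The function name and the one-covariate restriction are illustrative assumptions, and the remaining terms of (27) and (28) are not reproduced here.

```python
def naive_mse_fay_herriot(x, D, A):
    """Return g1_i + g2_i for each area in a one-covariate Fay-Herriot model.

    x : scalar covariates x_i (illustrative p = 1 case)
    D : known sampling variances D_i
    A : prior variance (pass an estimate of A to obtain the naive measure)
    """
    B = [Di / (A + Di) for Di in D]
    # sum_j D_j^{-1} B_j x_j^2, which equals sum_j x_j^2 / (A + D_j)
    info = sum(Bj * xj * xj / Dj for Bj, xj, Dj in zip(B, x, D))
    g1 = [(1 - Bi) * Di for Bi, Di in zip(B, D)]          # uncertainty with known parameters
    g2 = [Bi * Bi * xi * xi / info for Bi, xi in zip(B, x)]  # extra term from estimating beta
    return [a + b for a, b in zip(g1, g2)]


values = naive_mse_fay_herriot(x=[1.0, 1.0, 1.0, 1.0], D=[1.0, 1.0, 1.0, 1.0], A=1.0)
```

In the balanced intercept-only case used here the sum collapses to $(1 - B) D + B D / m$, which is a quick hand check on the implementation.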



2.5. Two Examples 



Efron and Morris (1975) successfully demonstrated the superiority of the empirical
Bayes estimator (see section 3) over the classical estimator $Y_i$ using the famous
baseball data, which contain the batting averages of 18 major league baseball
players. It is instructive to compare various measures of uncertainty of their
empirical Bayes estimator. Table 1 presents the various measures of uncertainty given
in Remark 1. The amount of inflation of the measures which incorporate the
uncertainty due to estimation of $A$ is substantial when compared with the naive measure
$V_i^{N}$. Note that $V_i^{PR}$ and $V_i^{N}$ are constant (0.341) and (0.153), respectively, for all
the baseball players. All the other measures of uncertainty change from player to
player since they depend on the individual $Y_i$.

For the above example, $m = 18$ may be considered to be small. We now consider
another example where $m = 51$ is moderately large. The U.S. Department of
Health and Human Services (HHS) uses estimates of median income of four-person
families at the state level to formulate its energy assistance program for low-income
families. Such data are provided by the U.S. Census Bureau for all the states
(including the District of Columbia) on an annual basis. The current estimates are
produced by an empirical Bayes procedure (see Fay 1987; Fay et al. 1993; Datta et
al. 1996; Ghosh et al. 1996). The data we analyze provide the usual design-based
estimates for four-person families ($Y_i$) and their sampling variances $D_i$ for all the 50 states
and the District of Columbia for the year 1988. As a demonstration, in order to
produce empirical Bayes estimates using the Fay-Herriot model, we choose 1979
census estimates of median income of four-person families updated by the change
in per-capita income obtainable from the Bureau of Economic Analysis. Our focus
here is on the comparison of different measures of uncertainty of the empirical
Bayes estimator. Table 2 presents the standard error of $Y_i$ (i.e., $\sqrt{D_i}$), $\sqrt{V_i^{N}}$,
$\sqrt{V_i^{PR}}$, $\sqrt{V_i^{LL}}$ and $\sqrt{V_i^{P}}$. All the different measures of uncertainty of the empirical
Bayes estimates are smaller than the measure of uncertainty of $Y_i$, the design-
based estimate of the median income of four-person families. In fact, there is a
considerable gain in using the empirical Bayes estimator. Generally, both $V_i^{PR}$ and $V_i^{P}$
are more conservative than $V_i^{LL}$. It appears that $V_i^{PR}$ is generally slightly more
conservative than $V_i^{P}$.






2.6. Simulation Experiment 



In this section, we conduct a simulation experiment to validate the proposed
measure of uncertainty and also to compare the proposed measure with other rival
methods. We consider two values of $m$, namely $m = 20$ and $m = 30$, and consider three
different combinations of $(\sigma^2, \tau^2)$ so as to cover all the three cases: (i) $\sigma^2/\tau^2 < 1$,
(ii) $\sigma^2/\tau^2 = 1$ and (iii) $\sigma^2/\tau^2 > 1$. We generated 10,000 independent $\theta_i$ from a
normal distribution with mean zero and variance $\sigma^2$, and then for each $\theta_i$ we
generated $y_i$ from a normal distribution with mean $\theta_i$ and variance $\tau^2$ ($i = 1, \ldots, m$). For each
simulation we found the confidence intervals, i.e., $\hat{\theta}_i^{EB} \pm z_{.025} \sqrt{V_i^{j}}$, where $z_{.025}$ is the
upper 2.5% point of the standard normal deviate and $j$ = N, LL, M, PR, Proposed, and
checked whether $\theta_i$ belonged to the confidence interval ($i = 1, \ldots, m$). We report the
average coverage probabilities and the average length for each method in Table 3.
In order to investigate the Bayesian coverage, we simulated data only once to find
the confidence intervals by the various methods. We then generated $\theta_i$ 10,000 times
from a normal distribution with mean $(1 - B_i)\mu + B_i y_i$ and variance $\tau^2 \sigma^2/(\tau^2 + \sigma^2)$ to see
whether $\theta_i$ belonged to the interval $\{\hat{\theta}_i^{EB} \pm z_{.025} \sqrt{V_i^{j}}\}$. The results are reported in
Table 4 and Table 5. In Table 6 we present the relative biases of the different MSE
estimators, which are calculated by 100 x [average E(estimator of MSE) - average
MSE]/(average MSE), where the average is over all the small areas and E denotes
the simulated average. In Table 7 we present the average simulated MSE of the MSE
estimators, i.e., $E\{V_i^{j} - MSE_i\}^2$, where $V_i^{j}$ is the MSE estimator of the $i$th small
area for $j$ = N, LL, M, PR, Proposed.
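The frequentist part of the experiment can be sketched for the naive interval alone. The sketch assumes the equal-variance setup described above (prior variance $\sigma^2$, known sampling variance $\tau^2$, common mean estimated by the sample mean), a simple truncated moment estimator of the prior variance, and the naive measure $(1 - \hat{B})\tau^2 + \hat{B}\tau^2/m$. The estimator of the prior variance, the function name, the seed and the number of replications are hypothetical choices, so the output is not expected to reproduce the tabulated values exactly.

```python
import math
import random


def simulate_naive_coverage(m=20, sigma2=1.0, tau2=1.0, reps=2000, seed=1):
    """Monte Carlo coverage and average length of naive EB intervals.

    theta_i ~ N(0, sigma2) and y_i | theta_i ~ N(theta_i, tau2), tau2 known.
    """
    rng = random.Random(seed)
    z = 1.959964  # upper 2.5% point of the standard normal distribution
    covered, total_len = 0, 0.0
    for _ in range(reps):
        theta = [rng.gauss(0.0, math.sqrt(sigma2)) for _ in range(m)]
        y = [rng.gauss(t, math.sqrt(tau2)) for t in theta]
        ybar = sum(y) / m
        s2 = sum((yi - ybar) ** 2 for yi in y) / (m - 1)
        a_hat = max(0.0, s2 - tau2)    # truncated moment estimate of the prior variance
        b_hat = tau2 / (a_hat + tau2)  # estimated shrinkage factor
        v_naive = (1 - b_hat) * tau2 + b_hat * tau2 / m  # g1 + g2 with plug-in
        half = z * math.sqrt(v_naive)
        for t, yi in zip(theta, y):
            est = ybar + (1 - b_hat) * (yi - ybar)  # EB point estimate
            covered += abs(est - t) <= half
            total_len += 2 * half
    n = reps * m
    return covered / n, total_len / n


coverage, avg_length = simulate_naive_coverage(reps=500)
```

Because the naive measure ignores the variability of the estimated shrinkage factor, the simulated coverage would be expected to fall below the nominal 95%, which is the qualitative pattern the Naive row of Table 3 reports.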



Table 1: Comparison of Different Measures of Uncertainty of Empirical Bayes 
Estimates for the Baseball Data (Efron and Morris, 1975) 



Player Name              Naive   KS I   KS II   Prasad-Rao   Laird-Louis   Proposed
Clemente (Pitts, NL)      .153   .519    .618      .353          .646        .725
F. Robinson (Balt, AL)    .153   .416    .515      .353          .512        .590
F. Howard (Wash, AL)      .153   .327    .427      .353          .396        .474
Johnstone (Cal, AL)       .153   .255    .351      .353          .301        .380
Berry (Chi, AL)           .153   .202    .301      .353          .232        .310
Spencer (Cal, AL)         .153   .202    .301      .353          .232        .310
Kessinger (Chi, NL)       .153   .168    .268      .353          .188        .266
L. Alvarado (Bos, AL)     .153   .154    .253      .353          .169        .248
Santo (Chi, NL)           .153   .162    .261      .353          .179        .258
Swoboda (NY, NL)          .153   .162    .261      .353          .179        .258
Unser (Wash, AL)          .153   .191    .291      .353          .218        .297
Williams (Chi, AL)        .153   .191    .291      .353          .218        .297
Scott (Bos, AL)           .153   .191    .291      .353          .218        .297
Petrocelli (Bos, AL)      .153   .191    .291      .353          .218        .297
E. Rodriguez (KC, AL)     .153   .191    .291      .333          .218        .297
Campaneris (Oak, AL)      .153   .249    .348      .353          .293        .372
Munson (NY, AL)           .153   .332    .432      .353          .402        .481
Alvis (Mil, AL)           .153   .451    .551      .353          .558        .636





Table 2: Comparison of Different Measures of Uncertainty of Empirical Bayes
Estimates of Median Incomes of Four-Person Families for 50 States and the District of
Columbia for the Year 1988

State                      Prasad-  Laird-  Pro-    | State                      Prasad-  Laird-  Pro-
No.    sqrt(D_i)  Naive    Rao      Louis   posed   | No.    sqrt(D_i)  Naive    Rao      Louis   posed
 1       2183     1265     1284     1266    1276    | 27       1765     1166     1186     1173    1183
 2       3248     1426     1439     1426    1433    | 28       2632     1335     1352     1336    1344
 3       2908     1371     1387     1374    1382    | 29       3790     1442     1454     1444    1450
 4       1989     1246     1266     1248    1258    | 30       1577     1109     1128     1134    1143
 5       3040     1393     1408     1446    1453    | 31       3094     1380     1395     1381    1388
 6       3655     1470     1482     1474    1479    | 32       3089     1385     1400     1386    1393
 7       1972     1231     1250     1232    1242    | 33       2766     1350     1366     1369    1377
 8       1705     1173     1192     1179    1189    | 34       3006     1368     1384     1371    1378
 9       1636     1129     1149     1133    1143    | 35       2031     1226     1245     1263    1272
10       1576     1108     1127     1108    1118    | 36       2723     1342     1358     1343    1352
11       3037     1385     1400     1390    1397    | 37       3122     1381     1395     1383    1390
12       1642     1135     1154     1138    1148    | 38       2492     1316     1334     1317    1326
13       1744     1166     1185     1167    1176    | 39       2586     1324     1342     1342    1350
14       1718     1154     1174     1178    1188    | 40       2867     1356     1372     1356    1364
15       2644     1348     1365     1349    1357    | 41       2685     1335     1352     1354    1363
16       2169     1260     1279     1264    1274    | 42       3116     1395     1410     1403    1410
17       3587     1427     1439     1432    1438    | 43       2733     1340     1357     1340    1348
18       2301     1278     1297     1314    1323    | 44       3951     1447     1458     1457    1463
19       2329     1281     1300     1317    1326    | 45       2521     1318     1335     1322    1331
20       2778     1351     1368     1352    1360    | 46       3019     1387     1402     1395    1403
21       2400     1301     1319     1341    1350    | 47       2827     1366     1382     1367    1375
22       3393     1425     1438     1435    1442    | 48       2780     1354     1370     1374    1382
23       3489     1452     1464     1457    1463    | 49       1731     1164     1183     1173    1182
24       8264     1533     1536     1533    1534    | 50       3885     1467     1478     1474    1480
25       3106     1403     1418     1407    1415    | 51       3856     1452     1464     1458    1463
26       1866     1187     1207     1206    1216    |















Table 3: Average Frequentist Coverage Probability and Average Length (nominal
coverage = .95). Average lengths are given in parentheses.

               $\sigma^2 = 0.5$, $\tau^2 = 0.2$    $\sigma^2 = 1.0$, $\tau^2 = 1.0$    $\sigma^2 = 1.0$, $\tau^2 = 0.5$
               m = 20    m = 30                    m = 20    m = 30                    m = 20    m = 30
Naive          0.834     0.845                     0.891     0.911                     0.845     0.862
               (1.39)    (1.38)                    (2.64)    (2.68)                    (2.11)    (2.12)
Laird-Louis    0.911     0.900                     0.924     0.929                     0.911     0.905
               (1.66)    (1.55)                    (2.86)    (2.81)                    (2.45)    (2.34)
Prasad-Rao     0.937     0.921                     0.926     0.929                     0.929     0.919
               (1.68)    (1.58)                    (2.83)    (2.80)                    (2.47)    (2.35)
Proposed       0.946     0.933                     0.938     0.939                     0.941     0.932
               (1.82)    (1.68)                    (2.99)    (2.90)                    (2.66)    (2.49)






Table 4: Simulated Bayesian Coverage Probabilities for $m = 20$, $\sigma^2 = 1.0$, $\tau^2 = 1.0$

No.    Naive    Laird-Louis    Prasad-Rao    Proposed
 1     0.929      0.939          0.952         0.949
 2     0.936      0.938          0.957         0.947
 3     0.941      0.943          0.961         0.951
 4     0.935      0.940          0.955         0.948
 5     0.940      0.944          0.958         0.951
 6     0.938      0.941          0.959         0.951
 7     0.924      0.946          0.945         0.954
 8     0.939      0.941          0.960         0.950
 9     0.936      0.940          0.957         0.951
10     0.920      0.952          0.915         0.958
11     0.920      0.953          0.944         0.959
12     0.939      0.942          0.959         0.952
13     0.926      0.949          0.950         0.955
14     0.931      0.946          0.953         0.953
15     0.867      0.960          0.902         0.965
16     0.930      0.944          0.950         0.951
17     0.935      0.937          0.956         0.946
18     0.935      0.946          0.957         0.955
19     0.938      0.939          0.957         0.948
20     0.911      0.951          0.937         0.956






Table 5: Simulated Bayesian Coverage Probabilities for $m = 30$, $\sigma^2 = 1.0$, $\tau^2 = 1.0$

No.    Naive    Laird-Louis    Prasad-Rao      Proposed
 1     0.938      0.945          0.954         (illegible)
 2     0.944      0.944          0.960         0.953
 3     0.937      0.938          (illegible)   0.945
 4     0.940      0.942          0.956         0.951
 5     0.938      0.940          0.955         0.948
 6     0.943      0.943          0.961         0.953
 7     0.928      0.949          0.948         0.957
 8     0.941      0.426          0.956         0.949
 9     0.936      0.941          0.953         0.949
10     0.925      0.949          0.944         0.956
11     0.922      0.949          0.940         0.954
12     0.939      0.941          0.955         0.948
13     0.935      0.952          0.953         0.956
14     0.936      0.949          0.955         0.956
15     0.879      0.954          0.907         0.958
16     0.937      0.947          0.955         0.955
17     0.942      0.943          0.950         0.950
18     0.941      0.948          0.957         0.955
19     0.944      0.945          0.959         0.952
20     0.920      0.523          0.943         0.957
21     0.924      0.951          0.944         0.957
22     0.942      0.943          0.956         0.950
23     0.945      0.945          0.959         0.952
24     0.933      0.945          0.951         0.951
25     0.924      0.956          0.947         0.961
26     0.943      0.944          0.958         0.952
27     0.939      0.943          0.954         0.950
28     0.929      0.943          0.948         0.952
29     0.941      0.944          0.957         0.952
30     0.940      0.942          0.957         0.950



Table 6: Percent Average Relative Biases of MSE Estimators

               $\sigma^2 = 0.5$, $\tau^2 = 0.2$    $\sigma^2 = 1.0$, $\tau^2 = 1.0$    $\sigma^2 = 1.0$, $\tau^2 = 0.5$
               m = 20    m = 30                    m = 20    m = 30                    m = 20    m = 30
Naive           -22       -19                       -17       -12                       -18       -22
Laird-Louis       5        -1                        -4        -4                        -3         1
Prasad-Rao        4        -2                        -7        -6                        -4        -1
Proposed         23        12                         4         2                         8        16



Table 7: Average MSE of MSE Estimators (multiply by $10^2$)

               $\sigma^2 = 0.5$, $\tau^2 = 0.2$    $\sigma^2 = 1.0$, $\tau^2 = 1.0$    $\sigma^2 = 1.0$, $\tau^2 = 0.5$
               m = 20    m = 30                    m = 20    m = 30                    m = 20    m = 30
Naive           0.92      0.74                      3.92      2.47                      3.06      4.11
Laird-Louis     1.00      0.70                      3.13      2.08                      2.76      3.92
Prasad-Rao      0.47      0.47                      2.25      1.78                      2.00      2.04
Proposed        0.98      0.63                      2.90      1.87                      2.46      3.71






APPENDIX 

We shall assume the following regularity conditions throughout the paper. The
regularity conditions will be referred to as (RC).

Regularity Conditions (RC): 

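(a) The elements of $X_i$ and $Z_i$ are uniformly bounded such that $\sum_{i=1}^{m} X_i' V_i^{-1}(\psi) X_i =
[O(m)]_{p \times p}$;

(b) $\sup_{i \ge 1} n_i < \infty$ and $\sup_{i \ge 1} k_i < \infty$;

(c) $l_i - X_i' V_i^{-1}(\psi) Z_i G_i(\psi) \lambda_i = [O(1)]_{p \times 1}$;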

( d ) A^" l ^)* C «'W) A 'l = [OtHUi for; = L..., 5 

(e) fli(^) = ZUo^iDijD^ and G;(0) = Vj^o^FijF^ where 0 O = I, D 0 - and 
(i = l,...,m,y = 0, ...,5) are known matrices of order m x k { and Av x it,- 

respectively and the elements are uniformly bounded known constants such that 
Ri(il>) and G,(0) (i = l,.. M rn) are all positive definite matrices. In special cases, 
some of Dij and may be null matrices. 

(f) $\hat{\psi}$ is an estimator of $\psi$ which satisfies (i) $\hat{\psi} - \psi = O_p(m^{-1/2})$, (ii) $\hat{\psi} - \hat{\psi}_{ML} =
O_p(m^{-1})$, (iii) $\hat{\psi}(-Y) = \hat{\psi}(Y)$ and (iv) $\hat{\psi}(Y + Xb) = \hat{\psi}(Y)$ for any $b \in R^p$ and
for all $Y$, where $Y = \mathrm{col}_{1 \le i \le m} Y_i$, $X = \mathrm{col}_{1 \le i \le m} X_i$ and $\hat{\psi}_{ML}$ is the maximum
likelihood estimator of $\psi$. Assume that $E(\hat{\psi} - \psi) = -m^{-1} B(\psi) + o(m^{-1})$.

Note that conditions (a)-(f) were also needed in Datta and Lahiri (1997) (also see
Prasad and Rao 1990). Condition (g) is reasonable following an argument of Cox
and Reid (1987).

Let $R_m = O_p(m^{-1})$ and $R_m^* = O_{p*}(m^{-1})$ denote sequences of random
variables such that $m R_m$ and $m R_m^*$ are bounded in probability under Model 1 and
Model 2, respectively.



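Theorem A.1. Under Model 1 and the regularity conditions (RC),

(i) $E_*[\hat{\theta}_i(Y_i; \hat{\beta}^*(\hat{\psi}^*), \hat{\psi}^*) - \hat{\theta}_i(Y_i; \hat{\beta}^*(\hat{\psi}), \hat{\psi})]^2 = g_{4i}(\hat{\psi}; Y_i) + o_p(m^{-1})$;

(ii) $V_i^{LL} = g_{1i}(\hat{\psi}) + g_{2i}(\hat{\psi}) + g_{4i}(\hat{\psi}; Y_i) - g_{3i}(\hat{\psi}) - m^{-1} B'(\hat{\psi}) \nabla g_{1i}(\hat{\psi}) + o_p(m^{-1})$.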

Proof of Theorem A.l.: Using an argument similar to the one given in the proof 
of Theorem A.l of Datta and Lahiri (1997), we get 

= Oi(YrJ'{*)j>) + - 4>)' V^/H </-)). 

+CV(m-'), (29) 

where, V6(Yr,0*(j>)h<J>) = {^Or. FM), h ■ ■ ■ , 
Using = L P r=l U^) x 

+ ij^Wifc*) U/H*) and S^/W) = 0 p .(m->) (see Cox and Reid .1987). we 



get 



= ^-h)^Oi[Yi-,^h\ 0 ^ (i) + O p .(m-' 1 ). (30) 



Now, 



= [n 0 + AJ G,(0) Z; K- l (^) (*S - Xifi)] } 

= 4: {a: ; <?irf) 2? »r l w} [*s - xM) + .w) - x^)] 

■ = - XM)) + 0 P Am-") (31) 

where Ifctf) = ^ {a;. 2? 1--'(^)} and /*'(>) - 4(0) = O p .(m"5). Using 

(29) - (31), 

»i(yS;^(f).f) 




= to;^)W0+|D^-^)^)|(i1 - xM)) + O p .(m- 1 ) 
= 0) + (V> - j>YLM>Wi - -W)) + Orim~ l ) y (32) 

where 1,(0) = co/i<j<,£y. 

Using (32), E. 0*{j>) = and £. (t/>' - 0) = 0 p (m- 1 ), we get 
£. to; 

= EJi(Y r ,P*(t),it>) + O p (m-') 

= & {/; r(0) + a;, g.-m vr'w o< - + o p (m- 1 ) 

= /</5(0) + ^ G,(J.) 2T/ V;-'(0) (VJ - + O p (m- 1 ) 

= to; + O p [m- 1 ). (33) 

Now using (33), we get 

icfto^oh,**)] 
= ^[to.-jW).*-) - £.{0,(>;;W),0-) }] 2 

= & [to; £*(*").**) - te;/5(</.),0) + O^m" 1 )] 2 
= £*[to;/h^),<h - ^,( V;-;. /}(0), 0) ] 2 + O p (m- 2 ) 

+O p (m-y,[^(V;;.hf)ii') - to; 
= £*p,(K;^"(0"),0") - di(YrJ^)J) J" + O p (m" 2 ) 

= &{to;jW*),r) - to;/H*),^) 

+O i (Y i ;frfat>) - WrJti').?)} 2 + O p (m- 2 ) 
= £*Q 2 + + 2£*0,<? 2 + O p (m~ 2 ), (34) 

since [to; ^(0-), 0-) - to; ,.)(0),0) ] =O p (m~ 1 ), by (33), and using the 
notation Q, = to;W).*) - to;/*(V'M') and Q 2 = to?^'*), **) 
-fe;W),0)- Note that 

Qi = OilYrJ'id'hh - to: 3(*)J) 






= /; W) + a; aw z\ vr l (i) (v; - x^)) 
-tiki) - K Gdi) z\ vr l W [Yi - xM)) . 
= Uptf) - hH + KGiifaziv-^x^)-^))} 

= { /,• - xwr\*) Zi gS) a.}' [rw - m] &) 

and using (32) we get 

Q2 = kYi-J'm-r) - to W),*) 

= (tf'-^WHtf - A' l .J(0)) + O p .(m- 1 ) (36) 

Thus, using (35) 

B. Ql = { - XiVr'it) Zi (7,(0) A,}' £ # [^(0) - i§(0)] 
[/W) - - XlV- l W Zi G.(0) A,} 

= { 'i - A';V;- l (^) Z, Gi(h A,}' Kar. (/j*(0)J 
{ /, - X\Vr l [v) Zi <?,■(*) A,} 

= { /, - x\vr l {h Zi d(h a,}' ff; wcte] ' 

{ u ~ XWH*) Zi GiW A,} 
= 92i(h (37) 

Note that in the above Var* [pfy)] = [£«, .V;Vr l (0)A',] _1 since Kar[/5(u«)] = 
[ESi^VT^W]" 1 . Using (36) 

&Ql = £;.[(r-^.(0)(V;-A' I ^(^)) + Op*(m- , )j 2 

= - M'lmw> - xM)m - xM))'L'imr - *)] 

+o p (m-') 

= £. Zrace [(0* - 0)'£,(0)(V; - .V {i 5(^))(J5 - XM))' 
£!(*)(«•'-*)] +0p('"~') 



= trace [ £,(0)(V} - XM))(Y : - XM))'L'^) 

EAr-M>"-tl>Y}+o P (rn- t ) 
= trace [lSKY - Xt'^Wi - XtM)? L' t {j>M1>)\ + c^m" 1 ) 
= 9«{hYi) + o P (m- 1 ). (3S) 

Note that in the above Var„(0') = S(V') since Var{ij)) = S(0). This proves 
part (i). Similarly, using (35) - (36) 

E+QxQz = [/,- - X\Vr x {*)Zi GM) E.{(0*(4>) -4(0)) 
(0* - 0)'}L,(0)(V; - A',- 0(0)) + o p (m" 1 ) 
= Op(m~ l ), (39) 

Note that in the above E. [(.tf-(0) - £(0)) (0- - 0)'] = OpCm" 1 ) by (g) of 
(RC). Now using (34), and (37) - (39), we get 

= 92iW + *,-(*$;*) + o p (m" 1 ). (40) 

Using Theorem A.2 of Datta and Lahiri (1997). we get 

EAguW*)) = §uW - m"' B'WVgui*) - g 3i (j,) + 0p ( m - 1 ). (-1 1 ) 
Now, part (ii) follows from (40) and (41). 

Theorem A.2. Under Model 1 and the regularity conditions (RC), we have

(i) $E[V_i^{P}] = r(\hat{\theta}_i^{EB}) + o(m^{-1})$,

(ii) $E[V_i^{LL}] = r(\hat{\theta}_i^{EB}) - 2m^{-1} B'(\psi) \nabla g_{1i}(\psi) - g_{3i}(\psi) + o(m^{-1})$.

Proof : Since 0 - 0 = o,(l), and S(0) = Ofm -1 ) we have £,(0) = £,(0) 
+ o p (l), 4(0) = /? + o p (l) and S(0) = S(0) +o p (m- 1 ). Thus, 

Li(4') [(Yi - XiM<))(Yi - XM)Y] 4(0) S,(0) 
= LiWHYi-XiaHYi-XiWm*) Si(ij,)+o p (m- 1 ). (42) 



Using (42) and the expressions for g&(&) and sr 4 ,(<f»; Yi), we get 

E[g4i(Yi\ 0)1 = 9xM + o(m- 1 ). (43) 

Using (13) , (43), rrr l B'(j,)Vg u (ii>) = m- l B , (^)V^ t (0)o p (m- 1 ) (which follows 
from 0 - 0 = o p (l)) , j 3 .-(V ? ) — i/3«(r) + o(m" 1 ) and the expression for part (i) 
of the theorem follows. Now, using part (ii) of Theorem A.l, (13) , (43) and part 
(i) of this theorem then part (ii) of the theorem follows. 



CHAPTER 3 



Empirical Bayes Estimation of Small Area Characteristics 
Under Random Sampling Variances 

3.1 Introduction 

As explained in Chapter 2, Model 1 covers many small area models considered
in the literature. The model assumes a random small area effect through the
prior distribution on the $U_i$'s, the location parameters. However, random small area
effects have not been introduced in the scale parameters. This will result in either
very unstable or oversmoothed estimates of the small area variances. To illustrate
the point, let $\theta_i$ and $\sigma_i^2$ be the true mean and true variance of the $i$th small
area, respectively ($i = 1, \ldots, m$). Let $y_{ij}$ be the $j$th observation in the $i$th small
area ($i = 1, \ldots, m$; $j = 1, \ldots, n_i$). Conditional on $\theta_i$ and $\sigma_i^2$, let $y_{i1}, \ldots, y_{in_i}$ be iid
$N(\theta_i, \sigma_i^2)$, $i = 1, \ldots, m$. Suppose that the primary parameter of interest is $\sigma_i^2$ ($i =
1, \ldots, m$). The assumption that the $\sigma_i^2$ ($i = 1, \ldots, m$) are different and fixed parameters
will lead to a direct estimator which utilizes the information contained in the
observations from the $i$th small area alone (e.g., $S_i^2 = (n_i - 1)^{-1} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2$,
the sample variance of the $i$th small area). On the other hand, the strong synthetic
assumption $\sigma_i^2 = \sigma^2$ ($i = 1, \ldots, m$) will lead to an oversmoothed estimator of $\sigma_i^2$ which
will use the sample from all areas, e.g., $S^2 = \sum_{i=1}^{m} (n_i - 1) S_i^2/(n - m)$ (where
$n = \sum_{i=1}^{m} n_i$), the pooled sample variance.
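The contrast between the two extremes can be made concrete with a short sketch that computes the direct estimators $S_i^2$ and the pooled estimator $S^2$ from the formulas above. The function name and the toy data are hypothetical and serve only as an illustration.

```python
def direct_and_pooled_variances(samples):
    """Direct and pooled variance estimates.

    samples : list of per-area lists of observations, each with n_i >= 2
    Returns (list of S_i^2, pooled S^2).
    """
    direct = []
    for ys in samples:
        n_i = len(ys)
        ybar = sum(ys) / n_i
        # S_i^2 uses only the observations from area i.
        direct.append(sum((y - ybar) ** 2 for y in ys) / (n_i - 1))
    n = sum(len(ys) for ys in samples)
    m = len(samples)
    # S^2 pools the within-area sums of squares across all areas.
    pooled = sum((len(ys) - 1) * s2 for ys, s2 in zip(samples, direct)) / (n - m)
    return direct, pooled


direct, pooled = direct_and_pooled_variances([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0, 8.0]])
```

With very small $n_i$ the direct values fluctuate widely from area to area, while the pooled value assigns the same variance to every area; a prior on $\sigma_i^2$ is a compromise between the two.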

The above discussion motivates us to consider a prior distribution for $\sigma_i^2$ ($i =
1, \ldots, m$). Kleffe and Rao (1992) used such a modeling of sampling variances
(i.e., $\sigma_i^2$) and observed that the formula for the EBLUP of $\theta_i$ remains unchanged
when compared to the formula given in Prasad and Rao (1990), which assumed
$\sigma_i^2 = \sigma^2$ ($i = 1, \ldots, m$). However, the assumption of random $\sigma_i^2$ inflates the mean
squared error of the EBLUP. Arora and Lahiri (1997) demonstrated that in such a
situation the hierarchical Bayes method provides estimates of $\theta_i$ which are superior to the
corresponding EBLUP. In section 3.2, we state the Bayesian small-area model and
discuss the Bayesian estimation of $b(\sigma_i^2)$, a real valued function of $\sigma_i^2$. In section
3.3, we obtain the Bayes estimator of $\theta_i$ under squared error loss. Replacing the prior
parameters in the Bayes estimator by their estimates, empirical Bayes estimation
of $b(\sigma_i^2)$ and $\theta_i$ is considered in section 3.4. A second order Laplace
approximation method is proposed in section 3.5 to replace the one-dimensional integrals
appearing in the Bayes and EB estimates of $b(\sigma_i^2)$ and $\theta_i$. Finally, in section 3.6,
we present results from a simulation.

3.2 The Bayes Estimation of $b(\sigma_i^2)$, a Real Valued Function
of $\sigma_i^2$

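Let $y_{ij}$ denote the value of a characteristic of interest for the $j$th unit of the
$i$th area ($i = 1, \ldots, m$; $j = 1, \ldots, n_i$). Let $y_i$ denote the vector of all values from
the $i$th small area, i.e., $y_i = (y_{i1}, \ldots, y_{in_i})$. We shall consider the following model:

MODEL 3

(i) Conditional on $\theta_i$ and $\sigma_i^2$, the $y_{ij}$'s are independent with
$y_{ij} \mid \theta_i, \sigma_i^2 \sim N(\theta_i, \sigma_i^2)$ ($i = 1, \ldots, m$; $j = 1, \ldots, n_i$);

(ii) $\theta_i \stackrel{ind}{\sim} N(x_i'\beta, \tau^2)$ ($i = 1, \ldots, m$);

(iii) $\sigma_i^2 \stackrel{ind}{\sim} IG\{\eta, (\eta - 1)\xi\}$ ($i = 1, \ldots, m$),

where the density of the inverse gamma (IG) distribution is given by $f(\sigma_i^2) = \{(\eta - 1)\xi\}^{\eta}
(1/\sigma_i^2)^{\eta + 1} e^{-(\eta - 1)\xi/\sigma_i^2}/\Gamma(\eta)$, $\sigma_i^2 > 0$. Note that $E(\sigma_i^2) = \xi$ and $var(\sigma_i^2) = \xi^2/(\eta -
2)$, $\eta > 2$. In the above model, we assume that $x_i$ is a $p \times 1$ vector of known and
fixed auxiliary variables and $\beta$ is a $p \times 1$ unknown fixed vector.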
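The moment statements for the inverse gamma prior can be checked numerically. The sketch below assumes the parameterization read from the density above, shape $\eta$ and scale $(\eta - 1)\xi$, and draws from it by inverting gamma variates from Python's standard `random` module, whose `gammavariate(alpha, beta)` takes a shape and a scale. The function name, seed and parameter values are illustrative.

```python
import random


def sample_inverse_gamma(eta, xi, size, seed=0):
    """Draw sigma^2 values from IG(shape=eta, scale=(eta - 1) * xi)."""
    rng = random.Random(seed)
    scale_ig = (eta - 1.0) * xi
    # If G ~ Gamma(shape=eta, scale=1/scale_ig), then 1/G ~ IG(eta, scale_ig).
    return [1.0 / rng.gammavariate(eta, 1.0 / scale_ig) for _ in range(size)]


draws = sample_inverse_gamma(eta=10.0, xi=2.0, size=100000)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
# Under this parameterization the mean should be close to xi = 2 and the
# variance close to xi^2 / (eta - 2) = 0.5.
```

Writing the scale as $(\eta - 1)\xi$ makes $\xi$ the prior mean of $\sigma_i^2$ directly, while $\eta$ alone controls how tightly the area variances concentrate around it.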

To get the posterior density of $\sigma_i^2 \mid y_i; \psi$, where $\psi = (\beta, \tau^2, \eta, \xi)$, we first write
down the joint density of $y_i$, $\theta_i$, $\sigma_i^2$:

= .W)-*.-**^^**^/^^ ,44, 
Consider the exponent of (44), i.e.. 



= 3E(«i-Fi + Fi-ft) , + 3(ft-«^) a 

a i ;=1 r 

= ^g(W-Fl) , + ^(F|-•l) , + ^l-«^)^ (45) 
where = 1 Eyii 1/0- Now the second and third term of rhs of (45) is 

+ Of + 1* { 0? + T* ] ( <7? + T*': (46) 

Thus, using (46), (45) equates to 

Replacing (47) into the exponent of (44) and then integrating out 0,. we get the 
joint p.d.f of and <r? as: 

/(y,,<r?) « + 3)"* 



Of T 

'2 I 



IG{U,, -[)£}. (AS) 






Now, let's consider second and third term in the exponent of (48) 

= (^+^){^(v; 2 )-[£(^] 2 } 

= (^ + ^)£[K-£(K)] 2 , (49) 

where the random variable V x is defined as 



[ Z<jJ W.p 0; = I - />;. 



(50) 



Now 



+ [^-^,-(l-p.W/?] 2 (l- Pi ) 

= [(1 - Pi)Vi - (1 - P.)x^] 2 ft + l-piiy, - x'M 2 ^ - Pi) 

= P.(l - P.) 2 (I/, - ^) 2 + P?d - P.KF; ~ ^) 2 

= W(l-ff)(?i-«J/?) a (51) 

Thus, using (51) and some algebra, (49) can be written as 



Now, using (48) and (52) the joint distribution of y t - and of can be written as: 




Finally, the posterior distribution of a? is given by 

(54) 

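where $a_i^2 = \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2$.

Hence, the Bayes estimator of $b(\sigma_i^2)$ under squared error loss is given by

$b^{B}(\sigma_i^2) = E[b(\sigma_i^2) \mid y_i; \psi] = \int_0^{\infty} b(\sigma_i^2) f(\sigma_i^2 \mid y_i; \psi)\, d\sigma_i^2$. (55)

The uncertainty of $b^{B}(\sigma_i^2)$ is measured by $Var[b(\sigma_i^2) \mid y_i; \psi] = E[\{b(\sigma_i^2)\}^2 \mid
y_i; \psi] - \{E[b(\sigma_i^2) \mid y_i; \psi]\}^2$.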

3.3 The Bayes Estimation of $\theta_i$

Using Model 3, we first find the posterior distribution of $\theta_i$ given $y_i$ and $\sigma_i^2$. Now, the
joint distribution of $y_i$, $\theta_i$ given $\sigma_i^2$ is given by

= ^-ii*^.^^^ (56) 

where $c$ is a constant which does not depend on $\theta_i$. Consider the exponent of (56), i.e.,

^ 1 1"= 1 ' 



a,- r 2 H 






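Hence, the distribution of $\theta_i$ given $y_i$ and $\sigma_i^2$ is normal with mean $\frac{n_i\tau^2}{\sigma_i^2 + n_i\tau^2}\,\bar{y}_i + \frac{\sigma_i^2}{\sigma_i^2 + n_i\tau^2}\,x_i'\beta$ and variance $\left(\frac{n_i}{\sigma_i^2} + \frac{1}{\tau^2}\right)^{-1}$.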
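The conditional posterior above can be evaluated directly. The following is a minimal sketch, assuming scalar inputs; the function name and the numerical inputs are hypothetical and chosen only for illustration. It also checks that the variance $(n_i/\sigma_i^2 + 1/\tau^2)^{-1}$ equals $\tau^2 B_i$ with $B_i = \sigma_i^2/(\sigma_i^2 + n_i\tau^2)$.

```python
def conditional_posterior_theta(ybar_i, n_i, sigma2_i, xb_i, tau2):
    """Mean and variance of theta_i given y_i and sigma_i^2 under Model 3.

    xb_i stands for x_i' beta. All names here are illustrative assumptions.
    """
    # Shrinkage weight B_i = sigma_i^2 / (sigma_i^2 + n_i * tau^2)
    B_i = sigma2_i / (sigma2_i + n_i * tau2)
    # Posterior mean: weighted average of the sample mean and the regression mean
    mean = (1.0 - B_i) * ybar_i + B_i * xb_i
    # Posterior variance: (n_i / sigma_i^2 + 1 / tau^2)^(-1)
    var = 1.0 / (n_i / sigma2_i + 1.0 / tau2)
    return mean, var, B_i


# Hypothetical inputs, not taken from the tables in this chapter
mean, var, B_i = conditional_posterior_theta(
    ybar_i=10.52, n_i=10, sigma2_i=4.0, xb_i=10.0, tau2=1.0
)
```

With these inputs the posterior mean lies between $x_i'\beta$ and $\bar{y}_i$, closer to $\bar{y}_i$ because $B_i$ is small when $n_i\tau^2$ dominates $\sigma_i^2$.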

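Thus, when $\psi = (\beta, \tau^2, \eta, \xi)$ is known, the Bayes estimator of $\theta_i$ is given by

$$
\begin{aligned}
\hat{\theta}_i^B = E[\theta_i \mid y_i; \psi] &= E\{E[\theta_i \mid y_i, \sigma_i^2; \psi] \mid y_i; \psi\} \\
&= E\{[(1 - B_i)\bar{y}_i + B_i x_i'\beta] \mid y_i; \psi\} \\
&= (1 - E[B_i \mid y_i; \psi])\,\bar{y}_i + E[B_i \mid y_i; \psi]\, x_i'\beta \\
&= (1 - w_i)\bar{y}_i + w_i x_i'\beta, \qquad (58)
\end{aligned}
$$

where $w_i = E[B_i \mid y_i; \psi] = \int_0^\infty \frac{\sigma_i^2}{\sigma_i^2 + n_i\tau^2} f(\sigma_i^2 \mid y_i; \psi)\, d\sigma_i^2$ and $B_i = \sigma_i^2/(\sigma_i^2 + n_i\tau^2)$.
The measure of uncertainty of $\hat{\theta}_i^B$ is given by

$$
\begin{aligned}
Var[\theta_i \mid y_i; \psi] &= E\{Var[\theta_i \mid y_i, \sigma_i^2; \psi] \mid y_i; \psi\} + Var\{E[\theta_i \mid y_i, \sigma_i^2; \psi] \mid y_i; \psi\} \\
&= E\left\{ \frac{\sigma_i^2\tau^2}{\sigma_i^2 + n_i\tau^2} \,\Big|\, y_i; \psi \right\} + Var\{(1 - B_i)\bar{y}_i + B_i x_i'\beta \mid y_i; \psi\} \\
&= \tau^2 w_i + (\bar{y}_i - x_i'\beta)^2\, Var[B_i \mid y_i; \psi]. \qquad (59)
\end{aligned}
$$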

To calculate $w_i$, let us make the transformation $B_i = \sigma_i^2/(\sigma_i^2 + n_i\tau^2)$, which implies $\sigma_i^2 = n_i\tau^2 \frac{B_i}{1 - B_i}$ and $\frac{d\sigma_i^2}{dB_i} = n_i\tau^2 (1 - B_i)^{-2}$. Note that the range of $B_i$ is between 0 and 1.
Using (54), the probability distribution of $B_i$ given $y_i$ is



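$$f(B_i \mid y_i; \psi) \propto B_i^{-\left(\frac{n_i - 1}{2} + \eta + 1\right)} (1 - B_i)^{\frac{n_i}{2} + \eta - 1} \exp\left\{ -\frac{S_i^2 (1 - B_i)}{2 n_i \tau^2 B_i} - \frac{(\bar{y}_i - x_i'\beta)^2 (1 - B_i)}{2\tau^2} - \frac{(\eta - 1)\xi (1 - B_i)}{n_i \tau^2 B_i} \right\}, \quad 0 < B_i < 1. \qquad (60)$$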



3.4 Empirical Bayes Estimation 



In practice, the Bayes estimators $\hat{\theta}_i^B$ and $b^B(\sigma_i^2)$ in (58) and (55), respectively, involve several unknown parameters $\psi = (\beta, \tau^2, \eta, \xi)$, so we need to estimate them from the available data. Arora (1994) proposed an ANOVA-type estimator for each unknown parameter. We will use his estimators to estimate $\psi$. The estimators of $\beta$, $\xi$, $\eta$ and $\tau^2$ are given by

v = [iXtf - »r l A.i,inx.] ' [£>;.(/,• - nr^.i,i:-)y.] (6D 

| = (n T - m - p)~ 1 f; £[1 - xfai4r l Xi]vi. (62) 
t=i 

n = 2+?/6* (63) 



(64) 



where A,- = n,T 2 /(f + n,r 2 ), (i = l,...m), n T = E£i = {E£,(n 2 - 
I)} -1 ££i - Xi(xri)- l Xi] Vi f-p. and n. = » T - t race [(.V'.Y)" 1 ££, n?^] . 
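In practice, $\tilde{\tau}^2$ might be negative; thus an estimator of $\tau^2$ is $\hat{\tau}^2 = \max(0, \tilde{\tau}^2)$.

The empirical Bayes estimators $\hat{\theta}_i^{EB}$ and $b^{EB}(\sigma_i^2)$ are now obtained from $\hat{\theta}_i^B$ and $b^B(\sigma_i^2)$, respectively, when $\psi$ is replaced by $\hat{\psi}$, and are given by

$$\hat{\theta}_i^{EB} = (1 - \hat{w}_i)\bar{y}_i + \hat{w}_i x_i'\hat{\beta} \qquad (65)$$

and

$$b^{EB}(\sigma_i^2) = E[b(\sigma_i^2) \mid y_i; \hat{\psi}] = \int_0^\infty b(\sigma_i^2) f(\sigma_i^2 \mid y_i; \hat{\psi})\, d\sigma_i^2. \qquad (66)$$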



3.5 Second Order Approximation 

The empirical Bayes estimators $\hat{\theta}_i^{EB}$ and $b^{EB}(\sigma_i^2)$ given in Section 3.4 are calculated by numerical integration. We know that the posterior probability density is proportional to the product of a likelihood function $L$ and a prior density $\pi$. Then the expectation we want to evaluate is of the form

$$E\{b(\theta) \mid y\} = \frac{\int b(\theta) L(\theta) \pi(\theta)\, d\theta}{\int L(\theta) \pi(\theta)\, d\theta}, \qquad (67)$$

where $b(\theta)$ is a real function of $\theta$ ($\theta$ is a parameter vector having a posterior density based on a sample of $n$ observations). The integrals in the numerator and denominator of (67) can be evaluated by asymptotic approximation using Laplace's method.

Lemma 1. If $h$ is a smooth function of an $m$-dimensional vector $\theta$ which has its first five derivatives and a minimum at $\hat{\theta}$, and $b$ is some other smooth function of $\theta$ which has its first three derivatives, then, under suitable regularity conditions,

E[b{0)) = b(9) + {b"[0)[h u [d)]~ 1 - b\9)h'»(0)[h"(0)]- 2 } (68) 

see Tierney et al. (1989).
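A second-order Laplace correction of this kind can be illustrated on a toy posterior. The sketch below is not the thesis computation: it assumes the standard-form expansion with $h$ taken as minus the log posterior (so the sample size is absorbed into $h$ and the correction carries a factor $1/2$), and it uses a Gamma posterior, chosen only because its mean is known exactly. The function name is hypothetical.

```python
def laplace_second_order(b, b1, b2, h2, h3):
    """Second-order Laplace approximation to a posterior expectation.

    Assumes h = -log(posterior) up to a constant, evaluated at its minimiser:
    b, b1, b2 are b and its first two derivatives at the mode,
    h2, h3 are the second and third derivatives of h at the mode.
    """
    return b + 0.5 * (b2 / h2 - b1 * h3 / h2 ** 2)


# Toy posterior: density proportional to theta^(a-1) * exp(-theta), theta > 0
a = 7.5
theta_hat = a - 1.0                      # mode, i.e. minimiser of h
h2 = (a - 1.0) / theta_hat ** 2          # h''(theta_hat)
h3 = -2.0 * (a - 1.0) / theta_hat ** 3   # h'''(theta_hat)

# b(theta) = theta, so b' = 1 and b'' = 0
approx_mean = laplace_second_order(theta_hat, 1.0, 0.0, h2, h3)
```

For this toy case the correction moves the mode $a - 1$ to the exact posterior mean $a$, which shows why the derivative terms matter when the posterior is skewed.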

Kass and Steffey (1989) pointed out that, after transforming $\sigma_i^2$ to $\sigma_i^2 = \exp(-\rho_i)$, working with $\rho_i$ is generally preferable to working with $\sigma_i^2$ in numerical work. Using the transformation above, we get the posterior probability density of $\rho_i$ given $y_i$:

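$$f(\rho_i \mid y_i) = c\, \exp\left\{ \rho_i \left( \frac{n_i - 1}{2} + \eta \right) \right\} (n_i\tau^2 + \exp(-\rho_i))^{-1/2} \exp\left\{ -\frac{S_i^2}{2\exp(-\rho_i)} - \frac{n_i(\bar{y}_i - x_i'\beta)^2}{2(n_i\tau^2 + \exp(-\rho_i))} - \frac{(\eta - 1)\xi}{\exp(-\rho_i)} \right\}. \qquad (69)$$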

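To calculate $w_i$, i.e., with $b(\rho_i) = \frac{\exp(-\rho_i)}{n_i\tau^2 + \exp(-\rho_i)}$, we need to evaluate (68), so we should have $h'(\rho_i)$, $h''(\rho_i)$ and $h'''(\rho_i)$, where $h(\rho_i) = -n_i^{-1} \log f(\rho_i \mid y_i)$. Taking logs on both sides of (69), we get

$$\log f(\rho_i \mid y_i) = \left( \frac{n_i - 1}{2} + \eta \right) \rho_i - \frac{1}{2} \log(n_i\tau^2 + \exp(-\rho_i)) - \frac{S_i^2}{2\exp(-\rho_i)} - \frac{n_i(\bar{y}_i - x_i'\beta)^2}{2(n_i\tau^2 + \exp(-\rho_i))} - \frac{(\eta - 1)\xi}{\exp(-\rho_i)} + \log c. \qquad (70)$$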
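For comparison with the Laplace route, $w_i$ can also be obtained by one-dimensional numerical integration over $\rho_i$. The following is a minimal sketch, assuming the log posterior has the form $((n_i - 1)/2 + \eta)\rho_i - \frac{1}{2}\log(n_i\tau^2 + e^{-\rho_i}) - S_i^2/(2e^{-\rho_i}) - n_i(\bar{y}_i - x_i'\beta)^2/(2(n_i\tau^2 + e^{-\rho_i})) - (\eta - 1)\xi/e^{-\rho_i}$ up to a constant. The grid limits, the grid size, the function names and the numerical inputs are all assumptions made for illustration; the thesis does not state which quadrature rule was used.

```python
import numpy as np


def log_post_rho(rho, n_i, S2_i, d2_i, tau2, eta, xi):
    """log f(rho_i | y_i) up to an additive constant.

    d2_i stands for (ybar_i - x_i' beta)^2 and S2_i for S_i^2.
    """
    e = np.exp(-rho)                       # sigma_i^2 = exp(-rho_i)
    return (((n_i - 1) / 2.0 + eta) * rho
            - 0.5 * np.log(n_i * tau2 + e)
            - S2_i / (2.0 * e)
            - n_i * d2_i / (2.0 * (n_i * tau2 + e))
            - (eta - 1.0) * xi / e)


def w_by_integration(n_i, S2_i, d2_i, tau2, eta, xi, lo=-8.0, hi=8.0, num=20001):
    """E[B_i | y_i] as a ratio of two trapezoid-rule integrals over rho_i."""
    rho = np.linspace(lo, hi, num)
    logf = log_post_rho(rho, n_i, S2_i, d2_i, tau2, eta, xi)
    f = np.exp(logf - logf.max())          # unnormalised density, rescaled for stability
    B = np.exp(-rho) / (n_i * tau2 + np.exp(-rho))

    def trap(g):
        # Composite trapezoid rule on the rho grid
        return float(np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(rho)))

    return trap(B * f) / trap(f)


# Hypothetical inputs, chosen only for illustration
w = w_by_integration(n_i=10, S2_i=36.0, d2_i=0.25, tau2=1.0, eta=10.0, xi=4.0)
```

Because the normalising constant $c$ cancels in the ratio, only the unnormalised density is needed.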

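The first, second and third derivatives of the log-likelihood are

$$
\begin{aligned}
\frac{d \log f(\rho_i \mid y_i)}{d\rho_i} = {} & \left( \frac{n_i - 1}{2} + \eta \right) + \frac{\exp(-\rho_i)}{2(n_i\tau^2 + \exp(-\rho_i))} - \frac{S_i^2}{2\exp(-\rho_i)} \\
& - \frac{n_i(\bar{y}_i - x_i'\beta)^2 \exp(-\rho_i)}{2(n_i\tau^2 + \exp(-\rho_i))^2} - \frac{(\eta - 1)\xi}{\exp(-\rho_i)}, \qquad (71)
\end{aligned}
$$

$$
\begin{aligned}
\frac{d^2 \log f(\rho_i \mid y_i)}{d\rho_i^2} = {} & -\frac{\exp(-\rho_i)}{2(n_i\tau^2 + \exp(-\rho_i))} + \frac{\exp(-2\rho_i)}{2(n_i\tau^2 + \exp(-\rho_i))^2} - \frac{S_i^2}{2\exp(-\rho_i)} \\
& + \frac{n_i(\bar{y}_i - x_i'\beta)^2 \exp(-\rho_i)}{2(n_i\tau^2 + \exp(-\rho_i))^2} - \frac{n_i(\bar{y}_i - x_i'\beta)^2 \exp(-2\rho_i)}{(n_i\tau^2 + \exp(-\rho_i))^3} - \frac{(\eta - 1)\xi}{\exp(-\rho_i)}, \qquad (72)
\end{aligned}
$$

$$
\begin{aligned}
\frac{d^3 \log f(\rho_i \mid y_i)}{d\rho_i^3} = {} & \frac{\exp(-\rho_i)}{2(n_i\tau^2 + \exp(-\rho_i))} - \frac{3\exp(-2\rho_i)}{2(n_i\tau^2 + \exp(-\rho_i))^2} + \frac{\exp(-3\rho_i)}{(n_i\tau^2 + \exp(-\rho_i))^3} - \frac{S_i^2}{2\exp(-\rho_i)} \\
& - \frac{n_i(\bar{y}_i - x_i'\beta)^2 \exp(-\rho_i)}{2(n_i\tau^2 + \exp(-\rho_i))^2} + \frac{3 n_i(\bar{y}_i - x_i'\beta)^2 \exp(-2\rho_i)}{(n_i\tau^2 + \exp(-\rho_i))^3} \\
& - \frac{3 n_i(\bar{y}_i - x_i'\beta)^2 \exp(-3\rho_i)}{(n_i\tau^2 + \exp(-\rho_i))^4} - \frac{(\eta - 1)\xi}{\exp(-\rho_i)}, \qquad (73)
\end{aligned}
$$

respectively. Note that $b'(\rho_i) = -\frac{\exp(-\rho_i)}{n_i\tau^2 + \exp(-\rho_i)} + \left( \frac{\exp(-\rho_i)}{n_i\tau^2 + \exp(-\rho_i)} \right)^2$ and $b''(\rho_i) = b(\rho_i) - 3[b(\rho_i)]^2 + 2[b(\rho_i)]^3$.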



An iterative routine is used to get the mode of $\sigma_i^2$ or $\rho_i$. The first and second derivatives $\nabla f$ and $H$ of the function $f(\rho_i \mid y_i)$ are used. In order to solve $f'(\rho_i \mid y_i) = 0$, we employ Newton's iterative procedure.
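Newton's procedure for the posterior mode can be sketched as follows. This is an illustration under stated assumptions, not the routine used in the thesis: the iteration is applied to the log posterior of $\rho_i$ (which has the same mode as the density), the derivative expressions are the ones obtained by differentiating the log posterior of $\rho_i$ once and twice, and the starting value, tolerance, function names and inputs are hypothetical.

```python
import math


def d1_d2_logpost(rho, n_i, S2_i, d2_i, tau2, eta, xi):
    """First and second derivatives of log f(rho_i | y_i) with respect to rho_i.

    d2_i stands for (ybar_i - x_i' beta)^2 and S2_i for S_i^2.
    """
    e = math.exp(-rho)           # sigma_i^2
    A = n_i * tau2 + e           # n_i tau^2 + exp(-rho_i)
    d1 = ((n_i - 1) / 2.0 + eta + e / (2 * A) - S2_i / (2 * e)
          - n_i * d2_i * e / (2 * A ** 2) - (eta - 1) * xi / e)
    d2 = (-e / (2 * A) + e ** 2 / (2 * A ** 2) - S2_i / (2 * e)
          + n_i * d2_i * e / (2 * A ** 2) - n_i * d2_i * e ** 2 / A ** 3
          - (eta - 1) * xi / e)
    return d1, d2


def newton_mode(rho0, args, tol=1e-10, max_iter=100):
    """Solve d log f / d rho = 0 by Newton's iteration."""
    rho = rho0
    for _ in range(max_iter):
        d1, d2 = d1_d2_logpost(rho, *args)
        step = d1 / d2
        rho -= step
        if abs(step) < tol:
            break
    return rho


# Hypothetical inputs: n_i, S_i^2, (ybar_i - x_i'beta)^2, tau^2, eta, xi
args = (10, 36.0, 0.25, 1.0, 10.0, 4.0)
rho_hat = newton_mode(rho0=-math.log(4.0), args=args)
```

Starting from $\rho_i = -\log \xi$ (the prior mean of $\sigma_i^2$ on the transformed scale) is a convenience choice made here, not something the text prescribes.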

3.6 Numerical Example 

In this section, using a numerical example, we will (a) check the accuracy of the Laplace approximation, (b) evaluate the Laird-Louis method of measuring uncertainty of the proposed empirical Bayes estimator and (c) compare the performances of the empirical Bayes estimators of $\theta_i$ and $\sigma_i^2$.

We generated $m = 30$ independent $\sigma_i^2$ $(i = 1, \ldots, 30)$ from $IG\{\eta, (\eta - 1)\xi\}$ and $\theta_i$ $(i = 1, \ldots, 30)$ from $N(\mu, \tau^2)$. We took $\mu = 10$ and $\tau^2 = 1$ and considered various combinations of $(\eta, \xi)$. Finally, we generated $y_{ij} \sim N(\theta_i, \sigma_i^2)$ $(i = 1, \ldots, 30;\ j = 1, \ldots, n_i = 10)$.
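The data-generating steps just described can be sketched with NumPy. This is an illustration only: the seed, the variable names and the choice $(\eta, \xi) = (10, 4)$ are assumptions, and an inverse gamma draw is obtained as the reciprocal of a gamma draw with shape $\eta$ and rate $(\eta - 1)\xi$.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen only for reproducibility

m, n_i = 30, 10                  # number of small areas, sample size per area
mu, tau2 = 10.0, 1.0             # mean and variance of theta_i
eta, xi = 10.0, 4.0              # one (eta, xi) combination, assumed here

# sigma_i^2 ~ IG{eta, (eta - 1) xi}: reciprocal of Gamma(shape=eta, rate=(eta - 1) xi)
sigma2 = 1.0 / rng.gamma(shape=eta, scale=1.0 / ((eta - 1.0) * xi), size=m)

# theta_i ~ N(mu, tau^2)
theta = rng.normal(mu, np.sqrt(tau2), size=m)

# y_ij ~ N(theta_i, sigma_i^2), one row per area
y = rng.normal(theta[:, None], np.sqrt(sigma2)[:, None], size=(m, n_i))

# Sufficient statistics used by the estimators
ybar = y.mean(axis=1)                          # sample means
S2 = ((y - ybar[:, None]) ** 2).sum(axis=1)    # within-area sums of squares
```

With this parameterisation the inverse gamma draws have mean $\xi$, matching $E(\sigma_i^2) = \xi$ in the model.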

In Table 8, we present the empirical Bayes estimates $\hat{\theta}_i^{EB}$ using numerical integration and Laplace methods. Clearly the second-order Laplace approximation is very close to numerical integration, agreeing up to four decimal places most of the time. In Table 9, we tabulate the empirical Bayes estimates $\hat{\sigma}_i^{2,EB}$, using both numerical integration and the Laplace method.

To capture the additional variability, Laird and Louis (1987) proposed a parametric bootstrap method. Following the Laird-Louis method, we generated $R = 1{,}000$ bootstrap samples as follows. For each $r = 1, \ldots, R$, we generated $\theta_i^{*r}$, where $i = 1, \ldots, m$, independently from a normal distribution with mean $\hat{\mu}$ and variance $\hat{\tau}^2$, and iid $\sigma_i^{2*r}$ from an inverse gamma distribution with mean $\hat{\xi}$ and variance $\hat{\xi}^2/(\hat{\eta} - 2)$. We then generated independent bootstrap samples $y_{ij}^{*r}$ from $N(\theta_i^{*r}, \sigma_i^{2*r})$. One can simplify the generation by generating $\bar{y}_i^{*r} \sim N(\theta_i^{*r}, \sigma_i^{2*r}/n_i)$ and $S_i^{2*r} \sim \sigma_i^{2*r}\chi^2_{n_i - 1}$. Equation (10) of Laird and Louis (1987) suggests the following measure of variability of $\hat{\theta}_i^{EB}$:

$$Var_i^{*} = R^{-1} \sum_{r=1}^{R} Var_i^B(y_i; \hat{\psi}_r^*) + (R - 1)^{-1} \sum_{r=1}^{R} \{\hat{\theta}_i^B(y_i; \hat{\psi}_r^*) - \bar{\theta}_i^{B*}(y_i)\}^2, \qquad (74)$$

where $\bar{\theta}_i^{B*}(y_i) = R^{-1} \sum_{r=1}^{R} \hat{\theta}_i^B(y_i; \hat{\psi}_r^*)$ and $\hat{\psi}_r^*$ is an estimate of $\psi$ based on the $r$th bootstrap sample. The Laird-Louis bootstrap measures of uncertainty are reported in Table 8. We point out that for 9 of 30 small areas, the Laird-Louis measure is smaller than the naive measure.
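Once the bootstrap replicates are available, the Laird-Louis measure combines two pieces: the average of the naive posterior variances and the between-replicate variance of the point estimates. The following is a minimal sketch for a single small area; the function name and the three-replicate inputs are hypothetical and serve only to show the arithmetic.

```python
import numpy as np


def laird_louis_variance(boot_naive_var, boot_estimates):
    """Bootstrap measure of variability for one small area.

    boot_naive_var[r]  : naive posterior variance recomputed with the
                         parameter estimate from the r-th bootstrap sample
    boot_estimates[r]  : Bayes estimate recomputed with that same estimate
    """
    v = np.asarray(boot_naive_var, dtype=float)
    e = np.asarray(boot_estimates, dtype=float)
    R = e.size
    # Average naive variance plus sample variance of the replicated estimates
    return v.mean() + ((e - e.mean()) ** 2).sum() / (R - 1)


# Tiny hypothetical example with R = 3 replicates
value = laird_louis_variance([0.25, 0.27, 0.26], [10.3, 10.5, 10.4])
```

In this example the first term is 0.26 and the second is 0.01, so the measure is about 0.27. The second term is what the naive measure omits.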

In Table 10-Table 24, we present the Average Absolute Bias, i.e., AAB $= m^{-1} \sum_{i=1}^{m} |T_i - e_i|$, where $T_i$ is $\theta_i$ or $\sigma_i^2$ and $e_i$ is $\bar{y}$, $\bar{y}_i$, $\hat{\theta}_i^{EB}$, $S^2$, $S_i^2$ or $\hat{\sigma}_i^{2,EB}$; the Average Squared Deviation (ASD) $= m^{-1} \sum_{i=1}^{m} (T_i - e_i)^2$; the Average Relative Bias (ARB) $= m^{-1} \sum_{i=1}^{m} |T_i - e_i| / T_i$; and the Average Relative Squared Deviation (ARSD) $= m^{-1} \sum_{i=1}^{m} (T_i - e_i)^2 / T_i^2$.

It turns out that $\hat{\theta}_i^{EB}$ is uniformly better than $\bar{y}_i$ and $\bar{y}$ with respect to all four measures of evaluation. However, the performance of $\hat{\sigma}_i^{2,EB}$ depends very much on the value of $\eta$. Note that for small $\eta$, $Var(\sigma_i^2)$ is large, which will support $S_i^2$, and for large $\eta$, $Var(\sigma_i^2)$ is small, which will support $S^2$. Our numerical results agree with the above observation. The empirical Bayes estimator $\hat{\sigma}_i^{2,EB}$ performs well unless $\eta$ is very small.
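The four evaluation measures can be computed in a few lines. The sketch below assumes that each measure is an average over the $m$ areas and that the relative versions divide by $|T_i|$ and $T_i^2$; the exact normalisation in the original formulas is not legible, so these are labelled assumptions. The function name and inputs are hypothetical.

```python
import numpy as np


def evaluation_measures(T, e):
    """AAB, ASD, ARB and ARSD of estimates e against true values T."""
    T = np.asarray(T, dtype=float)
    e = np.asarray(e, dtype=float)
    diff = T - e
    return {
        "AAB": float(np.mean(np.abs(diff))),               # average absolute bias
        "ASD": float(np.mean(diff ** 2)),                  # average squared deviation
        "ARB": float(np.mean(np.abs(diff) / np.abs(T))),   # average relative bias
        "ARSD": float(np.mean(diff ** 2 / T ** 2)),        # average relative squared deviation
    }


# Two hypothetical areas
measures = evaluation_measures(T=[10.0, 8.0], e=[9.0, 10.0])
```

For these inputs the absolute errors are 1 and 2, giving AAB 1.5, ASD 2.5, ARB 0.175 and ARSD 0.03625.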



Table 8: Empirical Bayes Estimate $\hat{\theta}_i^{EB}$ of $\theta_i$ using Numerical Integration and Laplace methods for $\eta = 10$ and $\xi = 4$

                          theta_i^EB             Naive(theta_i^EB)     MSE(theta_i^EB)
No.   ybar_i   theta_i    Integr.    Laplace     Integr.   Laplace     Bootstrap
 1    10.52     9.98      10.3705    10.3706     0.2425    0.2422      0.2738
 2    10.28     9.19      10.1352    10.1351     0.4289    0.4291      0.2734
 3     8.61    10.37       9.1303     9.1305     0.3564    0.3558      0.2981
 4     9.76     9.62       9.8292     9.8292     0.3097    0.3097      0.2717
 5    10.08     9.79      10.0434    10.0434     0.3005    0.3005      0.2731
 6    10.46    10.26      10.3361    10.3362     0.2287    0.2284      0.2730
 7     9.32     9.76       9.4991     9.4990     0.2544    0.2541      0.2742
 8     9.31     9.71       9.4755     9.4754     0.2299    0.2296      0.2777
 9    10.36     9.58      10.2156    10.2155     0.3314    0.3314      0.2720
10    10.57    10.13      10.4239    10.4240     0.2272    0.2269      0.2770
11     9.92     9.99       9.9290     9.9290     0.2158    0.2156      0.2716
12    10.51    10.35      10.3609    10.3610     0.2486    0.2483      0.2728
13     8.82     9.66       9.1299     9.1297     0.2491    0.2483      0.2908
14     7.37     8.70       8.4212     8.4220     0.4074    0.4052      0.3745
15     9.11     8.94       9.3828     9.3828     0.2960    0.2957      0.2842
16     9.79    10.91       9.8413     9.8412     0.2536    0.2535      0.2738
17     8.98     8.68       9.2717     9.2716     0.2711    0.2705      0.2824
18    10.74    10.74      10.4304    10.4303     0.3702    0.3701      0.2780
19    11.84    11.14      11.3218    11.3220     0.2663    0.2641      0.3228
20    11.26    10.92      10.9318    10.9320     0.2374    0.2363      0.2935
21     8.06     8.69       8.5423     8.5420     0.2422    0.2398      0.3260
22    10.07     9.35      10.0365    10.0365     0.2859    0.2859      0.2703
23    10.24     9.81      10.1649    10.1649     0.2609    0.2608      0.2680
24    10.06    10.70      10.0254    10.0254     0.3234    0.3235      0.2716
25    10.33    10.55      10.2447    10.2448     0.2177    0.2174      0.2739
26     9.51    11.13       9.6793     9.6794     0.3440    0.3440      0.2780
27    10.06     9.96      10.0338    10.0338     0.2401    0.2400      0.2703
28     8.61     8.86       8.9738     8.9736     0.2524    0.2512      0.2977
29    13.44    12.67      12.5112    12.5115     0.2958    0.2861      0.4468
30    11.07    10.83      10.7776    10.7778     0.2460    0.2452      0.2912



Table 9: Empirical Bayes Estimate $\hat{\sigma}_i^{2,EB}$ of $\sigma_i^2$ using Numerical Integration and Laplace methods for $\eta = 10$ and $\xi = 4$

                                   sigma_i^{2,EB}              Laplace
No.   sigma_i^2    S_i^2        Laplace    Integration    Naive MSE    Boots MSE
 1     3.737        2.433       3.348      3.374          0.767        3.234
 2     6.376       14.182       8.400      8.463          4.779        3.622
 3    13.185        7.797       5.782      5.827          2.316        3.339
 4     6.335        5.839       4.811      4.848          1.575        3.123
 5     2.756        5.350       4.598      4.633          1.439        3.557
 6     4.541        1.852       3.093      3.117          0.655        4.188
 7     3.693        2.937       3.572      3.599          0.874        5.662
 8     3.146        1.866       3.105      3.130          0.661        3.364
 9     5.360        7.050       5.340      5.380          1.941        3.473
10     2.683        1.766       3.060      3.084          0.642        3.410
11     2.040        1.364       2.874      2.896          0.565        3.224
12     4.471        2.707       3.467      3.493          0.822        2.904
13     3.305        2.481       3.405      3.432          0.802        3.038
14     [illegible]  [illegible] 6.427      6.498          [illegible]  [illegible]
15     3.265        4.850       4.420      4.454          1.341        3.152
16     8.775        3.003       3.586      3.613          0.877        4.310
17     4.163        3.557       3.865      3.896          1.029        2.875
18     5.436        9.313       6.350      6.399          2.754        3.285
19     2.960        2.656       3.568      3.598          0.899        3.547
20     2.424        1.908       3.164      3.189          0.694        3.034
21     2.258        1.675       3.124      3.150          0.690        3.997
22     2.989        4.580       4.266      4.299          1.240        3.094
23     5.025        3.331       3.729      3.758          0.949        2.844
24     6.268        6.639       5.154      5.193          1.806        3.356
25     1.724        1.417       2.901      2.923          0.576        3.203
26     5.535        7.796       5.666      5.709          2.185        3.140
27     2.763        2.399       3.323      3.349          0.754        2.819
28     5.481        2.488       3.428      3.456          0.817        2.976
29     3.207        1.585       3.400      3.436          0.889        3.202
30     3.637        2.369       3.352      3.379          0.776        3.766



Table 10: The Average Absolute Bias, Average Square Deviation, Average Relative Bias and Average Relative Square Deviation of different estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 5$ and $A = \xi/\tau^2 = 0.25$

                 theta_i                         sigma_i^2
        ybar    ybar_i   theta_i^EB     S^2     S_i^2    sigma_i^{2,EB}
AAB     0.686   0.132    0.126          0.123   0.099    0.094
ASD     0.767   0.026    0.025          0.024   0.015    0.018
ARB     0.068   0.013    0.013          0.504   0.378    0.356
ARSD    0.007   0.000    0.000          0.413   0.186    0.230

Table 11: The Average Absolute Bias, Average Square Deviation, Average Relative Bias and Average Relative Square Deviation of different estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 5$ and $A = \xi/\tau^2 = 0.50$

                 theta_i                         sigma_i^2
        ybar    ybar_i   theta_i^EB     S^2     S_i^2    sigma_i^{2,EB}
AAB     0.687   0.188    0.171          0.245   0.197    0.187
ASD     0.767   0.052    0.050          0.097   0.061    0.069
ARB     0.069   0.019    0.017          0.504   0.378    0.355
ARSD    0.007   0.000    0.000          0.413   0.186    0.230

Table 12: The Average Absolute Bias, Average Square Deviation, Average Relative Bias and Average Relative Square Deviation of different estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 5$ and $A = \xi/\tau^2 = 1.00$

                 theta_i                         sigma_i^2
        ybar    ybar_i   theta_i^EB     S^2     S_i^2    sigma_i^{2,EB}
AAB     0.687   0.265    0.234          0.490   0.394    0.374
ASD     0.767   0.104    0.095          0.389   0.246    0.277
ARB     0.069   0.027    0.024          0.504   0.378    0.355
ARSD    0.007   0.001    0.001          0.413   0.186    0.229

Table 13: The Average Absolute Bias, Average Square Deviation, Average Relative Bias and Average Relative Square Deviation of different estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 5$ and $A = \xi/\tau^2 = 2.00$

                 theta_i                         sigma_i^2
        ybar    ybar_i   theta_i^EB     S^2     S_i^2    sigma_i^{2,EB}
AAB     0.688   0.375    0.319          0.980   0.789    0.747
ASD     0.767   0.207    0.173          1.555   0.983    1.101
ARB     0.069   0.038    0.032          0.504   0.378    0.354
ARSD    0.007   0.002    0.001          0.413   0.186    0.228

Table 14: The Average Absolute Bias, Average Square Deviation, Average Relative Bias and Average Relative Square Deviation of different estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 5$ and $A = \xi/\tau^2 = 4.00$

                 theta_i                         sigma_i^2
        ybar    ybar_i   theta_i^EB     S^2     S_i^2    sigma_i^{2,EB}
AAB     0.689   0.531    0.419          1.960   1.577    1.493
ASD     0.767   0.415    0.297          6.221   3.931    4.385
ARB     0.069   0.054    0.042          0.504   0.378    0.354
ARSD    0.007   0.004    0.003          0.413   0.186    0.226



Table 15: The Average Absolute Bias, Average Square Deviation, Average Relative Bias and Average Relative Square Deviation of different estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 10$ and $A = \xi/\tau^2 = 0.25$

                 theta_i                         sigma_i^2
        ybar    ybar_i   theta_i^EB     S^2     S_i^2    sigma_i^{2,EB}
AAB     0.683   0.140    0.136          0.104   0.114    0.076
ASD     0.767   0.032    0.031          0.021   0.025    0.015
ARB     0.068   0.014    0.014          0.398   0.386    0.255
ARSD    0.007   0.000    0.000          0.250   0.214    0.010

Table 16: The Average Absolute Bias, Average Square Deviation, Average Relative Bias and Average Relative Square Deviation of different estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 10$ and $A = \xi/\tau^2 = 0.50$

                 theta_i                         sigma_i^2
        ybar    ybar_i   theta_i^EB     S^2     S_i^2    sigma_i^{2,EB}
AAB     0.682   0.197    0.188          0.209   0.228    0.151
ASD     0.767   0.064    0.060          0.086   0.101    0.059
ARB     0.068   0.020    0.019          0.398   0.386    0.254
ARSD    0.007   0.001    0.001          0.250   0.214    0.010

Table 17: The Average Absolute Bias, Average Square Deviation, Average Relative Bias and Average Relative Square Deviation of different estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 10$ and $A = \xi/\tau^2 = 1.00$

                 theta_i                         sigma_i^2
        ybar    ybar_i   theta_i^EB     S^2     S_i^2    sigma_i^{2,EB}
AAB     0.680   0.279    0.251          0.118   0.457    0.301
ASD     0.767   0.127    0.109          0.343   0.402    0.234
ARB     0.068   0.028    0.025          0.398   0.386    0.252
ARSD    0.007   0.001    0.001          0.250   0.214    0.010

Table 18: The Average Absolute Bias, Average Square Deviation, Average Relative Bias and Average Relative Square Deviation of different estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 10$ and $A = \xi/\tau^2 = 2.00$

                 theta_i                         sigma_i^2
        ybar    ybar_i   theta_i^EB     S^2     S_i^2    sigma_i^{2,EB}
AAB     0.679   0.395    0.322          0.836   0.914    0.599
ASD     0.769   0.254    0.187          1.370   1.608    0.934
ARB     0.067   0.039    0.032          0.398   0.386    0.251
ARSD    0.007   0.003    0.002          0.250   0.214    0.010

Table 19: The Average Absolute Bias, Average Square Deviation, Average Relative Bias and Average Relative Square Deviation of different estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 10$ and $A = \xi/\tau^2 = 4.00$

                 theta_i                         sigma_i^2
        ybar    ybar_i   theta_i^EB     S^2     S_i^2    sigma_i^{2,EB}
AAB     0.679   0.558    0.402          1.671   1.829    1.197
ASD     0.771   0.509    0.293          5.481   6.432    3.703
ARB     0.067   0.056    0.040          0.398   0.386    0.252
ARSD    0.003   0.005    0.003          0.250   0.214    0.010



Table 20: The Average Absolute Bias, Average Square Deviation, Average Relative Bias and Average Relative Square Deviation of different estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 100$ and $A = \xi/\tau^2 = 0.25$

                 theta_i                         sigma_i^2
        ybar    ybar_i   theta_i^EB     S^2     S_i^2    sigma_i^{2,EB}
AAB     0.684   0.134    0.132          0.030   0.095    0.028
ASD     0.766   0.025    0.025          0.001   0.015    0.001
ARB     0.068   0.013    0.013          0.117   0.364    0.110
ARSD    0.007   0.000    0.000          0.020   0.202    0.017

Table 21: The Average Absolute Bias, Average Square Deviation, Average Relative Bias and Average Relative Square Deviation of different estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 100$ and $A = \xi/\tau^2 = 0.50$

                 theta_i                         sigma_i^2
        ybar    ybar_i   theta_i^EB     S^2     S_i^2    sigma_i^{2,EB}
AAB     0.684   0.189    0.184          0.059   0.191    0.056
ASD     0.766   0.051    0.049          0.005   0.060    0.004
ARB     0.068   0.019    0.018          0.117   0.364    0.110
ARSD    0.007   0.000    0.000          0.020   0.202    0.017

Table 22: The Average Absolute Bias, Average Square Deviation, Average Relative Bias and Average Relative Square Deviation of different estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 100$ and $A = \xi/\tau^2 = 1.00$

                 theta_i                         sigma_i^2
        ybar    ybar_i   theta_i^EB     S^2     S_i^2    sigma_i^{2,EB}
AAB     0.684   0.268    0.255          0.118   0.381    0.112
ASD     0.767   0.102    0.094          0.020   0.239    0.017
ARB     0.068   0.027    0.025          0.117   0.364    0.110
ARSD    0.007   0.001    0.001          0.020   0.202    0.017

Table 23: The Average Absolute Bias, Average Square Deviation, Average Relative Bias and Average Relative Square Deviation of different estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 100$ and $A = \xi/\tau^2 = 2.00$

                 theta_i                         sigma_i^2
        ybar    ybar_i   theta_i^EB     S^2     S_i^2    sigma_i^{2,EB}
AAB     0.684   0.378    0.340          0.238   0.762    0.223
ASD     0.767   0.203    0.170          0.079   0.956    0.068
ARB     0.068   0.038    0.034          0.117   0.364    0.110
ARSD    0.007   0.002    0.002          0.020   0.202    0.017

Table 24: The Average Absolute Bias, Average Square Deviation, Average Relative Bias and Average Relative Square Deviation of different estimates of $\theta_i$ and $\sigma_i^2$ where $\eta = 100$ and $A = \xi/\tau^2 = 4.00$

                 theta_i                         sigma_i^2
        ybar    ybar_i   theta_i^EB     S^2     S_i^2    sigma_i^{2,EB}
AAB     0.683   0.535    0.434          0.476   1.524    0.444
ASD     0.767   0.407    0.288          0.316   3.826    0.272
ARB     0.068   0.053    0.043          0.117   0.364    0.110
ARSD    0.007   0.004    0.003          0.020   0.202    0.017



CHAPTER 4 

Empirical Bayes Estimation of Finite Population Means 

4.1 Introduction 

Suppose a finite population consists of $m$ strata and the $i$th stratum has $N_i$ (known) units. We are interested in estimating certain characteristics for each stratum as well as for the entire population. Let $y$ be a characteristic of interest and let $y_{ij}$ denote the value of the characteristic for the $j$th element of the $i$th stratum $(i = 1, \ldots, m;\ j = 1, \ldots, N_i)$. In this chapter, we consider estimation of the mean of each stratum, i.e., $\gamma_i = N_i^{-1} \sum_{j=1}^{N_i} y_{ij}$ $(i = 1, \ldots, m)$.

The primary source of data is usually a sample survey. Suppose a stratified random sample is drawn from the population. Without loss of generality, suppose $y_i = (y_{i1}, \ldots, y_{in_i})'$ denotes the random sample of size $n_i$ drawn from the $i$th stratum $(i = 1, \ldots, m)$. The traditional design-unbiased estimator of $\gamma_i$ is the sample mean given by $\bar{y}_i = n_i^{-1} \sum_{j=1}^{n_i} y_{ij}$ $(i = 1, \ldots, m)$.

Ericson (1969a) put forward an elegant formulation of the subjective Bayes approach to finite population sampling. In his approach, he first assumed that the finite population is a realization from a hypothetical population, which is the usual assumption in the super-population approach in finite population theory (see Royall 1970). At the second stage, Ericson (1969a) assumed a subjective prior distribution on the parameters of the super-population model. In practice, it is generally difficult to apply Ericson's Bayesian method since the prior parameters are hardly ever known. Ghosh and Meeden (1986) considered an empirical Bayes approach under stratified simple random sampling, using a one-way random effects model. They successfully demonstrated that their method can be very effective in repeated surveys and small-area estimation. Their empirical Bayes estimator is asymptotically optimal in the sense of Robbins (1955). Later on, Ghosh and Lahiri (1987) relaxed the normality assumption of Ghosh and Meeden (1986) and showed that the Ghosh-Meeden estimator is robust under the assumption of posterior linearity (see Ericson 1969b; Goldstein 1975; Hartigan 1969). The Ghosh-Meeden empirical Bayes estimator can also be motivated from the best linear unbiased prediction approach of Prasad and Rao (1990). Nandram and Sedransk (1993) extended the Ghosh-Meeden estimator to the case of different but random sampling variances. Recently, Arora et al. (1997) considered an alternative to the Nandram-Sedransk method. Their method can incorporate relevant auxiliary information which may be available from various administrative records and censuses. They also proposed, for the first time, a measure of uncertainty of the empirical Bayes estimator of finite population means which can incorporate uncertainty due to estimation of all the parameters in the Bayesian model. Their method is an extension, to finite population sampling, of the parametric bootstrap method proposed earlier by Laird and Louis (1987).

This chapter is a follow-up to the method proposed by Arora et al. (1997). Note that the method proposed by Arora et al. (1997) involves one-dimensional numerical integration. Although this is not a big problem in calculating the empirical Bayes point estimates, it poses a serious problem in finding the measure of uncertainty of the empirical Bayes estimates, since several one-dimensional integrals must be calculated at each step of the bootstrap simulation and the accuracy of the numerical integration is hard to check at each step of the Monte Carlo simulation. In order to overcome this difficulty, we propose in this chapter a suitable approximation, using Laplace's second-order approximation (see Tierney et al. 1989), to the estimation procedure introduced by Arora et al. (1997).

In Section 4.2, we consider the Bayes and empirical Bayes estimators of the finite population mean $\gamma_i$ $(i = 1, \ldots, m)$. In Section 4.3, we propose a measure of uncertainty of the Bayes and the empirical Bayes estimators proposed in Section 4.2.






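4.2 The Bayes and Empirical Bayes Estimation of $\gamma_i = N_i^{-1} \sum_{j=1}^{N_i} y_{ij}$

We shall consider the following model:
MODEL 4

(i) Conditional on $\theta_i$ and $\sigma_i^2$, the $y_{ij}$'s are independent with
$y_{ij} \mid \theta_i, \sigma_i^2 \sim N(\theta_i, \sigma_i^2)$, $(i = 1, \ldots, m;\ j = 1, \ldots, N_i)$;

(ii) $\theta_i \stackrel{ind}{\sim} N(x_i'\beta, \tau^2)$, $(i = 1, \ldots, m)$;

(iii) $\sigma_i^2 \stackrel{ind}{\sim} IG\{\eta, (\eta - 1)\xi\}$, $(i = 1, \ldots, m)$

where the density of the inverse gamma (IG) is given by $f(\sigma_i^2) = \{(\eta - 1)\xi\}^{\eta} (1/\sigma_i^2)^{\eta + 1} e^{-(\eta - 1)\xi/\sigma_i^2} / \Gamma(\eta)$, $\sigma_i^2 > 0$. Note that the Ghosh-Meeden model can be viewed as a special case of Model 4 when $\eta$ tends to infinity, which implies $\sigma_i^2 = \xi$ $(i = 1, \ldots, m)$. Under Model 4 and squared error loss, i.e., $L(a, \gamma) = m^{-1} \sum_{i=1}^{m} (a_i - \gamma_i)^2$, the Bayes estimator of $\gamma_i$ is given by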

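$$
\begin{aligned}
e_i^B &= E(\gamma_i \mid y_i; \psi) \\
&= N_i^{-1} \sum_{j=1}^{N_i} E\{E(y_{ij} \mid y_i, \theta_i, \sigma_i^2; \psi) \mid y_i; \psi\}. \qquad (75)
\end{aligned}
$$

Using (58), it follows from (75) that

$$
\begin{aligned}
e_i^B &= (1 - f_i)\bar{y}_i + f_i((1 - w_i)\bar{y}_i + w_i x_i'\beta) \\
&= (1 - f_i w_i)\bar{y}_i + f_i w_i x_i'\beta, \qquad (76)
\end{aligned}
$$

where $w_i = E[\sigma_i^2/(\sigma_i^2 + n_i\tau^2) \mid y_i; \psi]$ and $f_i = (N_i - n_i)/N_i$, the finite population correction factor. The empirical Bayes estimator of $\gamma_i = N_i^{-1} \sum_{j=1}^{N_i} y_{ij}$ can be found by replacing $w_i$ with $\hat{w}_i$. Note that $\hat{w}_i$ is calculated using the procedure in Section 3.5.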
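The effect of the finite population correction on the shrinkage weight can be seen in a short sketch. It assumes the estimator has the form $(1 - f_i w_i)\bar{y}_i + f_i w_i x_i'\beta$ with $f_i = (N_i - n_i)/N_i$; the function name and the value $w_i = 0.3$ are hypothetical, chosen only for illustration.

```python
def bayes_estimator_stratum_mean(ybar_i, xb_i, w_i, n_i, N_i):
    """Bayes estimator of a stratum mean; xb_i stands for x_i' beta."""
    f_i = (N_i - n_i) / N_i            # finite population correction factor
    # Sampled units are known, so shrinkage only acts on the unsampled fraction f_i
    return (1.0 - f_i * w_i) * ybar_i + f_i * w_i * xb_i


# Hypothetical inputs; w_i = 0.3 is an assumed posterior mean of B_i
e_B = bayes_estimator_stratum_mean(ybar_i=10.408, xb_i=10.251, w_i=0.3, n_i=10, N_i=100)
```

When $n_i = N_i$ the factor $f_i$ is zero and the estimator reduces to the sample mean, as it should for a fully enumerated stratum.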




4.3 Measure of Uncertainty of Empirical Bayes Estimator 

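It follows from Arora et al. (1997) that a measure of variability of the empirical Bayes estimator is given by

$$
\begin{aligned}
Var_i^B = Var_i^B(y_i; \psi) &= Var(\gamma_i \mid y_i; \psi) \\
&= N_i^{-2}\, Var\Big( \sum_{j=n_i+1}^{N_i} y_{ij} \,\Big|\, y_i; \psi \Big) \\
&= N_i^{-2} \Big[ E\Big\{ Var\Big( \sum_{j=n_i+1}^{N_i} y_{ij} \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big) \,\Big|\, y_i; \psi \Big\} \\
&\qquad\quad + Var\Big\{ E\Big( \sum_{j=n_i+1}^{N_i} y_{ij} \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big) \,\Big|\, y_i; \psi \Big\} \Big] \\
&= N_i^{-1} f_i E(\sigma_i^2 \mid y_i; \psi) + f_i^2 \{ Var[E(\theta_i \mid y_i, \sigma_i^2; \psi) \mid y_i; \psi] \\
&\qquad\quad + E[Var(\theta_i \mid y_i, \sigma_i^2; \psi) \mid y_i; \psi] \} \\
&= N_i^{-1} f_i E(\sigma_i^2 \mid y_i; \psi) + f_i^2 (\bar{y}_i - x_i'\beta)^2\, Var(B_i \mid y_i; \psi) \\
&\qquad\quad + f_i^2 \tau^2 E(B_i \mid y_i; \psi), \qquad (77)
\end{aligned}
$$

where $B_i = \sigma_i^2/(\sigma_i^2 + n_i\tau^2)$, and the expectations and the variance in the last equation of (77) are with respect to the probability distribution of $\sigma_i^2$ given $y_i$ and $\psi$ as in (54). We propose to calculate $E(B_i \mid y_i; \psi)$ and $Var(B_i \mid y_i; \psi) = E(B_i^2 \mid y_i; \psi) - [E(B_i \mid y_i; \psi)]^2$ using the Laplace method. We use equation (71) to find $E(B_i \mid y_i; \psi)$ and $E(B_i^2 \mid y_i; \psi)$, where, for the latter, $b(\sigma_i^2) = B_i^2 = [\exp(-\rho_i)(\exp(-\rho_i) + n_i\tau^2)^{-1}]^2$, $b'(\sigma_i^2) = -2b(\sigma_i^2) + 2[b(\sigma_i^2)]^{3/2}$ and $b''(\sigma_i^2) = 4b(\sigma_i^2) - 10[b(\sigma_i^2)]^{3/2} + 6[b(\sigma_i^2)]^2$, the derivatives being taken with respect to $\rho_i$.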
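The closed-form derivatives of $b = B_i^2$ can be checked numerically. The sketch below compares $b' = -2b + 2b^{3/2}$ and $b'' = 4b - 10b^{3/2} + 6b^2$ with central finite differences in $\rho_i$; the evaluation point, step size and function names are assumptions made for illustration.

```python
import math


def B(rho, n_i, tau2):
    """B_i = exp(-rho_i) / (exp(-rho_i) + n_i * tau^2)."""
    e = math.exp(-rho)
    return e / (e + n_i * tau2)


def b_and_derivs(rho, n_i, tau2):
    """b = B_i^2 and its first two derivatives with respect to rho_i."""
    b = B(rho, n_i, tau2) ** 2
    b1 = -2.0 * b + 2.0 * b ** 1.5
    b2 = 4.0 * b - 10.0 * b ** 1.5 + 6.0 * b ** 2
    return b, b1, b2


# Hypothetical evaluation point and step size
rho, n_i, tau2, h = -1.3, 10, 1.0, 1e-4
b, b1, b2 = b_and_derivs(rho, n_i, tau2)


def f(r):
    return B(r, n_i, tau2) ** 2


# Central finite differences for comparison
fd1 = (f(rho + h) - f(rho - h)) / (2 * h)
fd2 = (f(rho + h) - 2 * f(rho) + f(rho - h)) / h ** 2
```

The agreement between `b1`, `b2` and `fd1`, `fd2` confirms that the derivative expressions are with respect to $\rho_i$, since $dB_i/d\rho_i = -B_i(1 - B_i)$.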

A naive measure of variability of the empirical Bayes estimator $e_i^{EB}$ is obtained as $Var_i^B(y_i; \hat{\psi})$. Note that this naive measure underestimates the true variability of $e_i^{EB}$ since it does not incorporate the additional variability due to estimation of $\psi$. Prasad and Rao (1990) proposed a method, based on a Taylor series expansion, that captures the additional variability due to estimation of $\psi$. Kass and Steffey (1989) proposed a measure of variability for the Bayesian approach using Laplace's first-order approximation. In order to include the additional variability, one can extend the parametric bootstrap method considered by Laird and Louis (1987) as in (74).

We generated $\theta_i$ independently from $N(\mu, \tau^2)$ $(i = 1, \ldots, 30)$, where $\mu = 10$, $\tau^2 = 1.0$, and $\sigma_i^2$ independently from $IG\{\eta, (\eta - 1)\xi\}$ for $\eta = 10$, $\xi = 4$. We generated $\bar{y}_i \sim N(\theta_i, \sigma_i^2/n_i)$ and $S_i^2 = \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2 \sim \sigma_i^2 \chi^2_{n_i - 1}$. We took $n_i = 10$ and $N_i = 100$ for $i = 1, \ldots, 30$. For the bootstrap computation we drew $R = 1{,}000$ replications from each density above with parameter $\hat{\psi}$. For computation of $e_i^B(y_i; \hat{\psi}_r^*)$ and $Var_i^B(y_i; \hat{\psi}_r^*)$ we considered four cases.

1. When 6 = Var(aj) < 0.0 and r 2 < 0.0, the bootstrap analogue of Model 4 
becomes ~ N(/t,£). Then 

i B (yr,r T ) = (l -m + fiK 

Varf{,ji-r r ) = N- l M;. (78) 

2. When £ > 0.0 but f 2 < 0.0 the bootstrap analogue of Model 4 becomes (i). 
y'i I *i '~ and (ii). a\' «? lG(ij,(fj - 1)0. Then the bootstrap empirical 
Bayes is 

i B {yi\k) = (I -fi)Vi + /<# 

V'«rf(j/.;VV) = Nr'fiE.tf' | ; 0 r "J. (79) 

where £.[<7?- | y,;<£ r -] = nilVi '~*'^$?? ir - l) ■ From (i). and (ii). we have the 
density /(a?- | oc ^"''^"exp-^n,-^- -/i) 2 +*?' + 2€« - 1)}- 

Now .E.fof" | y, r ;$ can be easily found. 

3. When 6 < 0.0 but f 2 > 0.0 the bootstrap analogue of Model 4 becomes 
(*')• Vij I ~ Af(tf?,e) and (ii). 0? JV(/i,f 2 ). Then the bootstrap empirical 
Bayes is 

Kar? fl (£) = Nr'/ii'+ffFrnr + TiiT 2 '); (SO) 
where u>; = t 2 '£'/(£' + n,r 2 -). 

4. When the bootstrap estimates of $\xi$ and $\tau^2$ are both greater than 0.0, we use Model 4 as usual, with equations (76) and (77).

In Table 25 we present the results of the measure of uncertainty of the bootstrap method as well as the naive method. In Tables 26-28 we report the AAB, ASD, ARB and ARSD for different $\eta$ and $A = \xi/\tau^2$. It turns out that the proposed estimator is uniformly better than the survey estimator.



Table 25: Empirical Bayes Estimate $e_i^{EB}$ of $\gamma_i$ where $\eta = 10$ and $\xi = 4$

                                                          MSE(e_i^EB)
No.   gamma_i       ybar      ybar_i        e_i^EB     Naive    Bootstrap
 1    10.204        10.251    10.408        10.350     0.202    0.210
 2     9.017        10.251     9.296         9.698     0.238    0.231
 3     9.780        10.251    11.141        10.605     0.375    0.231
 4     9.857        10.251     9.820        10.007     0.243    0.213
 5     9.943        10.251    10.204        10.222     0.203    0.209
 6    10.425        10.251    10.842        10.635     0.191    0.218
 7     9.929        10.251    11.098        10.749     0.231    0.228
 8     9.510        10.251     9.525         9.793     0.203    0.221
 9     9.234        10.251    10.377        10.325     0.227    0.208
10     9.966        10.251     9.901        10.021     0.185    0.210
11     9.903        10.251    10.417        10.367     0.160    0.208
12    10.169        10.251    10.451        10.378     0.198    0.212
13    10.092        10.251     9.791         9.990     0.241    0.213
14     8.648        10.251     9.280         9.778     0.301    0.234
15    [illegible]   10.251    [illegible]    9.856     0.185    0.217
16    10.671        10.251     9.813         9.992     0.226    0.214
17     8.590        10.251     7.314         8.734     0.326    0.447
18    10.323        10.251    10.367        10.322     0.214    0.207
19    11.173        10.251    11.501        11.089     0.184    0.254
20    10.923        10.251    10.772        10.564     0.221    0.215
21     8.601        10.251     8.994         9.424     0.191    0.254
22     9.553        10.251     9.038         9.594     0.266    0.248
23     9.825        10.251    11.038        10.761     0.193    0.226
24    10.829        10.251    11.307        10.826     0.261    0.242
25    10.418        10.251    10.685        10.541     0.180    0.213
26    10.802        10.251    10.643        10.510     0.184    0.213
27     9.844        10.251     9.930        10.037     0.179    0.210
28     8.960        10.251     9.900        10.054     0.246    0.212
29    12.627        10.251    11.710        11.254     0.175    0.268
30    11.062        10.251    12.313        11.528     0.229    0.331



Table 26: The Average Absolute Bias, Average Square Deviation, Average Relative Bias and Average Relative Square Deviation of different estimates of $\gamma_i$ where $\eta = 5$

               A = 0.25                      A = 0.50
        ybar     ybar_i   e_i^EB      ybar     ybar_i   e_i^EB
AAB     0.6851   0.1497   0.1470      0.6878   0.2118   0.2029
ASD     0.7762   0.0338   0.0325      0.7858   0.0676   0.0632
ARB     0.0689   0.0150   0.0148      0.0694   0.0212   0.0206
ARSD    0.0077   0.0003   0.0003      0.0079   0.0007   0.0007

               A = 1.00                      A = 2.00
        ybar     ybar_i   e_i^EB      ybar     ybar_i   e_i^EB
AAB     0.6982   0.2995   0.2797      0.7158   0.4235   0.3836
ASD     0.8052   0.1351   0.1204      0.8440   0.2703   0.2216
ARB     0.0708   0.0300   0.0284      0.0731   0.0424   0.0390
ARSD    0.0082   0.0014   0.0013      0.0088   0.0027   0.0024

               A = 4.00
        ybar     ybar_i   e_i^EB
AAB     0.7590   0.5990   0.5212
ASD     0.9214   0.5406   0.3878
ARB     0.0781   0.0600   0.0533
ARSD    0.0099   0.0054   0.0042



Table 27: The Average Absolute Bias, Average Square Deviation, Average Relative Bias and Average Relative Square Deviation of different estimates of $\gamma_i$ where $\eta = 10$

               A = 0.25                      A = 0.50
        ybar     ybar_i   e_i^EB      ybar     ybar_i   e_i^EB
AAB     0.6680   0.1252   0.1222      0.6759   0.2020   0.1997
ASD     0.7532   0.0230   0.0224      0.7593   0.0665   0.0606
ARB     0.0667   0.0125   0.0122      0.0682   0.0203   0.0203
ARSD    0.0073   0.0002   0.0002      0.0076   0.0007   0.0006

               A = 1.00                      A = 2.00
        ybar     ybar_i   e_i^EB      ybar     ybar_i        e_i^EB
AAB     0.6768   0.2695   0.2656      0.6855   [illegible]   0.3535
ASD     0.7621   0.1176   0.1125      0.7768   0.2352        0.2093
ARB     0.0685   0.0273   0.0270      0.0698   0.0387        0.0360
ARSD    0.0077   0.0012   0.0012      0.0081   0.0025        0.0022

               A = 4.00
        ybar     ybar_i   e_i^EB
AAB     0.7033   0.5389   0.4607
ASD     0.8141   0.4703   0.3677
ARB     0.0721   0.0548   0.0471
ARSD    0.0087   0.0050   0.0039



Table 28: The Average Absolute Bias, Average Square Deviation, Average Relative Bias and Average Relative Square Deviation of different estimates of $\gamma_i$ where $\eta = 100$

               A = 0.25                      A = 0.50
        ybar     ybar_i   e_i^EB      ybar     ybar_i   e_i^EB
AAB     0.6669   0.1415   0.1344      0.6610   0.2002   0.1866
ASD     0.7561   0.0284   0.0276      0.7538   0.0567   0.0534
ARB     0.0664   0.0141   0.0134      0.0657   0.0200   0.0185
ARSD    0.0072   0.0003   0.0003      0.0071   0.0006   0.0005

               A = 1.00                      A = 2.00
        ybar     ybar_i   e_i^EB      ybar     ybar_i   e_i^EB
AAB     0.6900   0.2641   0.2619      0.7059   0.3735   0.3582
ASD     0.7805   0.1149   0.1097      0.8104   0.2298   0.2061
ARB     0.0710   0.0263   0.0262      0.0722   0.0372   0.0360
ARSD    0.0080   0.0011   0.0011      0.0086   0.0023   0.0021

               A = 4.00
        ybar     ybar_i   e_i^EB
AAB     0.7354   0.5282   0.4908
ASD     0.8769   0.4596   0.3685
ARB     0.0758   0.0526   0.0498
ARSD    0.0095   0.0046   0.0038

CHAPTER 5 
Empirical Bayes Estimation of Finite Population Variances 
5.1 Introduction

In this chapter, we consider the setting of Chapter 4 but consider the estimation of the strata variances, i.e., $\gamma_i = N_i^{-1} \sum_{j=1}^{N_i} (y_{ij} - \mu_i)^2$, where $\mu_i$ denotes the mean of the $i$th stratum $(i = 1, \ldots, m)$.

For the last fifteen years, there has been a growing demand from both the public
and private sectors to produce reliable statistics for various subgroups of a finite
population. According to Brackstone (1987), "there is in Canada, and probably in
other countries too, an increasing government concern with issues of distribution,
equity and disparity." Consider the problem of comparing the income distributions
of various geographical areas of a country. Is it enough to consider just the
per-capita income? Probably not, since two geographical areas may be comparable
in terms of their per-capita incomes, yet they may vary considerably in terms
of diversity, which can be measured by the variances of their income distributions.
Although the estimation of finite population variances for different geographic groups
is a very important problem, it has received less attention than the
estimation of means, ratios and proportions for different geographical
areas in finite population sampling.

Ericson (1969a) briefly addressed the problem of the Bayesian estimation of
a finite population variance under simple random sampling. Datta and Ghosh
(1993) provided a unified approach to the Bayesian estimation of different strata
variances in finite population sampling under stratified random sampling. Ghosh
and Lahiri (1987) considered the problem using a linear empirical Bayes approach.
Lahiri and Tiwari (1990) proposed a nonparametric empirical Bayes estimator
using the Dirichlet process prior (Ferguson 1973).






Note that the model considered by Datta and Ghosh (1993) does not incorporate
stratum-specific random effects through the scale components. Although this
synthetic assumption may have an insignificant effect on the estimation of the
stratum means, it may cause undue shrinkage in the Bayes estimators of the
stratum variances. Ghosh and Lahiri (1987) and Lahiri and Tiwari (1990) introduced
random stratum effects through the scale parameters, but even then they
failed to overcome the overshrinkage problem, primarily because of the linear nature
of their Bayes estimators. However, we recognize that the linear empirical Bayes
procedure of Ghosh and Lahiri (1987) and the nonparametric empirical Bayes
approach of Lahiri and Tiwari (1990) are very robust, and that it is difficult to resolve the
overshrinkage problem without being specific about the distribution
of the stratum-specific random scale effects.

In Section 5.2, we propose the Bayes and empirical Bayes estimators of the
$i$th stratum variance. In Section 5.3, we present a measure of uncertainty of the
estimators proposed in Section 5.2.

5.2 The Bayes and Empirical Bayes Estimation of the
Strata Variances

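Recall the model introduced in Chapter 4:
MODEL 4

(i) Conditional on $\theta_i$ and $\sigma_i^2$, the $y_{ij}$'s are independent with
$y_{ij} \mid \theta_i, \sigma_i^2 \sim N(\theta_i, \sigma_i^2)$, $(i = 1, \ldots, m;\ j = 1, \ldots, N_i)$;

(ii) $\theta_i \overset{ind}{\sim} N(x_i'\beta, \tau^2)$, $(i = 1, \ldots, m)$;

(iii) $\sigma_i^2 \overset{iid}{\sim} IG(\eta, (\eta - 1)\xi)$, $(i = 1, \ldots, m)$,

where the inverse gamma (IG) density has mean $\xi$ and variance $\xi^2/(\eta - 2)$.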
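As a quick numerical check of the parameterization in (iii), the sketch below assumes the prior is read as an inverse gamma with shape $\eta$ and scale $(\eta - 1)\xi$ in the usual shape/scale convention (an assumption made here for illustration) and confirms that this choice gives mean $\xi$ and variance $\xi^2/(\eta - 2)$. The function name is hypothetical, and the values $\eta = 10$, $\xi = 4$ are only illustrative.

```python
def inverse_gamma_moments(shape, scale):
    """Mean and variance of an inverse gamma law in the shape/scale convention.

    Assumes a density proportional to x**(-shape - 1) * exp(-scale / x);
    the variance is finite only when shape > 2.
    """
    if shape <= 2:
        raise ValueError("variance is finite only when shape > 2")
    mean = scale / (shape - 1)
    variance = scale ** 2 / ((shape - 1) ** 2 * (shape - 2))
    return mean, variance


eta, xi = 10.0, 4.0  # illustrative hyperparameter values (assumption)
mean, variance = inverse_gamma_moments(eta, (eta - 1) * xi)
print(mean, variance)  # to be compared with xi and xi**2 / (eta - 2)
```

With this reading, $\xi$ acts as the prior mean of the stratum variances and $\eta$ controls how tightly they concentrate around it.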
Suppose a sample of size $n_i$ is collected from the $i$th finite population, i.e., we have
the observation vector $y_i = (y_{i1}, \ldots, y_{in_i})$. The Bayes estimator of $\gamma_i$, under
squared error loss, is

$$ e_i^B = E(\gamma_i \mid y_i; \psi). \tag{81} $$

Let $s_i = \{1, \ldots, n_i\}$ be the set of labels of units in the sample.

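The goal now is to estimate $\gamma_i$, the variance for the $i$th finite population (small
area). To get the Bayes estimator of $\gamma_i(y_i) = N_i^{-1} \sum_{j=1}^{N_i} (y_{ij} - \bar{Y}_i)^2$, Arora (1994)
proved the following lemma.

Lemma 2: For any $N$ real numbers $a_i$ $(i = 1, \ldots, N)$,

$$ \sum_{i=1}^{N} \sum_{j=1}^{N} (a_i - a_j)^2 = 2N \sum_{i=1}^{N} (a_i - \bar{a})^2, \quad \text{where } \bar{a} = N^{-1} \sum_{i=1}^{N} a_i. \tag{82} $$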
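The identity in Lemma 2 is easy to check numerically. The following minimal sketch (function names are illustrative, not from the source) evaluates both sides for an arbitrary list of numbers.

```python
def pairwise_sum_of_squares(a):
    """Left-hand side of Lemma 2: sum over all ordered pairs of (a_i - a_j)^2."""
    return sum((x - y) ** 2 for x in a for y in a)


def scaled_centered_sum_of_squares(a):
    """Right-hand side of Lemma 2: 2N times the sum of (a_i - abar)^2."""
    n = len(a)
    abar = sum(a) / n
    return 2 * n * sum((x - abar) ** 2 for x in a)


values = [2.0, 5.0, 7.0, 10.0]
print(pairwise_sum_of_squares(values), scaled_centered_sum_of_squares(values))
```

The lemma is what allows the population variance to be rewritten purely in terms of pairwise differences, so that the population mean never has to be predicted separately.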



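Using Lemma 2, $\gamma_i$ can be written as

$$ \gamma_i = \tfrac{1}{2} N_i^{-2} \sum_{1 \le j \ne j' \le N_i} (y_{ij} - y_{ij'})^2. \tag{83} $$

Three cases are possible:

(i) $j \in s_i$ and $j' \in s_i$,
(ii) $j \in s_i$ but $j' \notin s_i$,
(iii) $j \notin s_i$ and $j' \notin s_i$.

Equation (83) can be written as

$$ \gamma_i = \tfrac{1}{2} N_i^{-2} \Big[ \sum_{1 \le j \ne j' \le n_i} (y_{ij} - y_{ij'})^2 + 2 \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij} - y_{ij'})^2 + \sum_{n_i+1 \le j \ne j' \le N_i} (y_{ij} - y_{ij'})^2 \Big]. \tag{84} $$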
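The three-case split can be illustrated with a small sketch. It assumes, purely for illustration, that the first `n` entries of the list are the sampled units, and it compares the decomposed form with the direct definition of the finite population variance (divisor $N_i$). Function names are hypothetical.

```python
def gamma_direct(y):
    """Finite population variance with divisor N."""
    N = len(y)
    ybar = sum(y) / N
    return sum((v - ybar) ** 2 for v in y) / N


def gamma_three_cases(y, n):
    """Same quantity via pairwise differences, split by sample membership.

    The first n entries of y are treated as sampled (an illustrative convention).
    """
    N = len(y)
    sampled, unsampled = y[:n], y[n:]
    both_sampled = sum((a - b) ** 2 for a in sampled for b in sampled)
    mixed = sum((a - b) ** 2 for a in sampled for b in unsampled)
    both_unsampled = sum((a - b) ** 2 for a in unsampled for b in unsampled)
    return (both_sampled + 2 * mixed + both_unsampled) / (2 * N ** 2)


y = [1.0, 4.0, 6.0, 9.0, 10.0]
print(gamma_direct(y), gamma_three_cases(y, n=2))
```

Only the second and third pieces involve unobserved units, which is why the Bayes estimator needs posterior expectations for those two pieces alone.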






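Using (84) in (81), one gets

$$ e_i^B = \tfrac{1}{2} N_i^{-2} \Big[ \sum_{1 \le j \ne j' \le n_i} (y_{ij} - y_{ij'})^2 + 2 \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} E\{(y_{ij} - y_{ij'})^2 \mid y_i\} + \sum_{n_i+1 \le j \ne j' \le N_i} E\{(y_{ij} - y_{ij'})^2 \mid y_i\} \Big]. \tag{85} $$

The expectation in the second term of (85) can be written as

$$ E[\{(y_{ij} - \theta_i) - (y_{ij'} - \theta_i)\}^2 \mid y_i] = E[(y_{ij} - \theta_i)^2 \mid y_i] + E[(y_{ij'} - \theta_i)^2 \mid y_i] - 2E[(y_{ij} - \theta_i)(y_{ij'} - \theta_i) \mid y_i]. \tag{86} $$

The first term in (86), i.e.,

$$ E[(y_{ij} - \theta_i)^2 \mid y_i] = E[\{y_{ij} - E(\theta_i \mid y_i)\}^2 \mid y_i] + E[\{\theta_i - E(\theta_i \mid y_i)\}^2 \mid y_i] - 2E[\{y_{ij} - E(\theta_i \mid y_i)\}\{\theta_i - E(\theta_i \mid y_i)\} \mid y_i] $$
$$ = \{y_{ij} - E(\theta_i \mid y_i)\}^2 + Var(\theta_i \mid y_i). \tag{87} $$

It was shown in Chapter 4 (mean estimation problem) that $E(\theta_i \mid y_i) = (1 - w_i)\bar{y}_i + w_i x_i'\beta$,
where $w_i = E[b(\sigma_i^2) \mid y_i]$, with $b(\sigma_i^2) = (1 + n_i \tau^2 \sigma_i^{-2})^{-1}$. $E$ represents
the expectation using the Laplace method. Thus, the first term of (86) becomes

$$ Var(\theta_i \mid y_i) + [y_{ij} - (1 - w_i)\bar{y}_i - w_i x_i'\beta]^2 $$
$$ = Var(\theta_i \mid y_i) + [y_{ij} - \bar{y}_i + w_i(\bar{y}_i - x_i'\beta)]^2 $$
$$ = Var(\theta_i \mid y_i) + (y_{ij} - \bar{y}_i)^2 + w_i^2(\bar{y}_i - x_i'\beta)^2 + 2w_i(y_{ij} - \bar{y}_i)(\bar{y}_i - x_i'\beta). \tag{88} $$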



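Second term of (86):

$$ E[(y_{ij'} - \theta_i)^2 \mid y_i] = E(\sigma_i^2 \mid y_i). \tag{89} $$

Third term of (86):

$$ E[(y_{ij} - \theta_i)(y_{ij'} - \theta_i) \mid y_i] = E[y_{ij}(y_{ij'} - \theta_i) \mid y_i] - E[\theta_i(y_{ij'} - \theta_i) \mid y_i] $$
$$ = E[y_{ij} E\{(y_{ij'} - \theta_i) \mid y_i, \theta_i, \sigma_i^2\} \mid y_i] - E[E\{\theta_i(y_{ij'} - \theta_i) \mid y_i, \theta_i, \sigma_i^2\} \mid y_i] $$
$$ = 0. \tag{90} $$

Using (88), (89), and (90) in (86), the second term of (85), where $j \in s_i$ and $j' \notin s_i$, is

$$ E[(y_{ij} - y_{ij'})^2 \mid y_i] = Var(\theta_i \mid y_i) + E(\sigma_i^2 \mid y_i) + (y_{ij} - \bar{y}_i)^2 + w_i^2(\bar{y}_i - x_i'\beta)^2 + 2w_i(y_{ij} - \bar{y}_i)(\bar{y}_i - x_i'\beta). \tag{91} $$

Now, the third expectation term of (85) is $E[(y_{ij} - y_{ij'})^2 \mid y_i]$, where $j, j' \notin s_i$:

$$ E[(y_{ij} - y_{ij'})^2 \mid y_i] = E[(y_{ij} - \theta_i)^2 \mid y_i] + E[(y_{ij'} - \theta_i)^2 \mid y_i] - 2E[(y_{ij} - \theta_i)(y_{ij'} - \theta_i) \mid y_i]. \tag{92} $$

First term in (92):

$$ E[(y_{ij} - \theta_i)^2 \mid y_i] = E[E\{(y_{ij} - \theta_i)^2 \mid y_i, \sigma_i^2, \theta_i\} \mid y_i] = E[\sigma_i^2 \mid y_i]. \tag{93} $$

Similarly, the second term in (92) is

$$ E[(y_{ij'} - \theta_i)^2 \mid y_i] = E(\sigma_i^2 \mid y_i). \tag{94} $$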



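The third term of (92) is

$$ E[(y_{ij} - \theta_i)(y_{ij'} - \theta_i) \mid y_i] = E[y_{ij}(y_{ij'} - \theta_i) \mid y_i] - E[\theta_i(y_{ij'} - \theta_i) \mid y_i] $$
$$ = E[y_{ij} E\{(y_{ij'} - \theta_i) \mid y_i, \theta_i, \sigma_i^2\} \mid y_i] - E[E\{\theta_i(y_{ij'} - \theta_i) \mid y_i, \theta_i, \sigma_i^2\} \mid y_i] $$
$$ = 0. \tag{95} $$

Combining (93), (94) and (95), the third term of (85) becomes

$$ E[(y_{ij} - y_{ij'})^2 \mid y_i] = 2E(\sigma_i^2 \mid y_i). \tag{96} $$

Now, using (91) and (96) in (85), the Bayes estimator of $\gamma_i$ is

$$ e_i^B = \tfrac{1}{2} N_i^{-2} \Big[ \sum_{1 \le j \ne j' \le n_i} (y_{ij} - y_{ij'})^2 + 2 \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} \{ Var(\theta_i \mid y_i) + E(\sigma_i^2 \mid y_i) + (y_{ij} - \bar{y}_i)^2 + w_i^2(\bar{y}_i - x_i'\beta)^2 + 2w_i(y_{ij} - \bar{y}_i)(\bar{y}_i - x_i'\beta) \} + 2 \sum_{n_i+1 \le j \ne j' \le N_i} E(\sigma_i^2 \mid y_i) \Big]. \tag{97} $$

Using $s_i^2 = (n_i - 1)^{-1} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2$, one gets from (97)

$$ e_i^B = N_i^{-2} \Big[ n_i(n_i - 1)s_i^2 + (N_i - n_i)n_i Var(\theta_i \mid y_i) + (N_i - n_i)n_i E(\sigma_i^2 \mid y_i) + (n_i - 1)(N_i - n_i)s_i^2 + (N_i - n_i)n_i w_i^2(\bar{y}_i - x_i'\beta)^2 + 2(N_i - n_i)w_i \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)(\bar{y}_i - x_i'\beta) + (N_i - n_i)(N_i - n_i - 1)E(\sigma_i^2 \mid y_i) \Big] $$
$$ = N_i^{-2} \Big[ N_i(n_i - 1)s_i^2 + (N_i - n_i)n_i\{ Var(\theta_i \mid y_i) + w_i^2(\bar{y}_i - x_i'\beta)^2 \} + (N_i - n_i)(N_i - 1)E(\sigma_i^2 \mid y_i) \Big], \tag{98} $$

where the cross-product term vanishes because $\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i) = 0$.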



Using the notation of Chapter 4, $f_i = (N_i - n_i)/N_i$, the Bayes estimator of $\gamma_i$
can be written as

$$ e_i^B = N_i^{-1}(n_i - 1)s_i^2 + f_i(1 - f_i)\{ Var(\theta_i \mid y_i) + w_i^2(\bar{y}_i - x_i'\beta)^2 \} + f_i(1 - N_i^{-1})E(\sigma_i^2 \mid y_i). \tag{99} $$

The empirical Bayes estimator of $\gamma_i$ under the squared error loss function can be found
by replacing $\psi$ with its estimate $\hat{\psi}$ in equation (99).
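The closed form can be sketched as a small function. The version below is a minimal illustration, not the author's implementation: it assumes the posterior quantities $w_i$, $Var(\theta_i \mid y_i)$ and $E(\sigma_i^2 \mid y_i)$ have already been obtained (for example by the Laplace approximation of Chapter 4) and are simply passed in, and it uses the three-term form $N_i^{-1}(n_i - 1)s_i^2 + f_i(1 - f_i)\{Var(\theta_i \mid y_i) + w_i^2(\bar{y}_i - x_i'\beta)^2\} + f_i(1 - N_i^{-1})E(\sigma_i^2 \mid y_i)$ implied by (98). All function and argument names are hypothetical.

```python
def bayes_variance_estimate(N, n, s2, ybar, xb, w, post_var_theta, post_mean_sigma2):
    """Compact form of the Bayes estimator of the stratum variance.

    N, n             : stratum population and sample sizes
    s2, ybar         : sample variance (divisor n - 1) and sample mean
    xb               : regression prior mean x_i' beta
    w                : shrinkage weight w_i (supplied by the caller)
    post_var_theta   : Var(theta_i | y_i) (supplied by the caller)
    post_mean_sigma2 : E(sigma_i^2 | y_i) (supplied by the caller)
    """
    f = (N - n) / N
    shrink = post_var_theta + w ** 2 * (ybar - xb) ** 2
    return (n - 1) * s2 / N + f * (1 - f) * shrink + f * (1 - 1 / N) * post_mean_sigma2


def bayes_variance_estimate_expanded(N, n, s2, ybar, xb, w, post_var_theta, post_mean_sigma2):
    """Same estimator written as in the second line of (98), before introducing f_i."""
    shrink = post_var_theta + w ** 2 * (ybar - xb) ** 2
    total = N * (n - 1) * s2 + (N - n) * n * shrink + (N - n) * (N - 1) * post_mean_sigma2
    return total / N ** 2


args = dict(N=20, n=5, s2=2.0, ybar=3.0, xb=2.5, w=0.4,
            post_var_theta=0.3, post_mean_sigma2=1.8)
print(bayes_variance_estimate(**args), bayes_variance_estimate_expanded(**args))
```

When the whole stratum is sampled ($n_i = N_i$, so $f_i = 0$), the function returns $(n_i - 1)s_i^2 / N_i$, i.e. the observed finite population variance, as it should.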

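5.3 Measure of Uncertainty of Strata Variance
Estimator

Under Model 4, the posterior variance of $N_i^{-1} \sum_{j=1}^{N_i} (y_{ij} - \bar{Y}_i)^2$ is given by

$$ Var\Big[ N_i^{-1} \sum_{j=1}^{N_i} (y_{ij} - \bar{Y}_i)^2 \,\Big|\, y_i \Big] = Var\Big[ E\Big\{ N_i^{-1} \sum_{j=1}^{N_i} (y_{ij} - \bar{Y}_i)^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} \,\Big|\, y_i \Big] + E\Big[ Var\Big\{ N_i^{-1} \sum_{j=1}^{N_i} (y_{ij} - \bar{Y}_i)^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} \,\Big|\, y_i \Big]. \tag{100} $$

The inner expectation in the first term of (100) follows from the derivation of Section 5.2 after
some algebra, with the superpopulation parameters treated as known,
i.e.,

$$ E\Big\{ N_i^{-1} \sum_{j=1}^{N_i} (y_{ij} - \bar{Y}_i)^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} = N_i^{-2}\{ N_i(n_i - 1)s_i^2 + (N_i - n_i)(N_i + n_i - 1)\sigma_i^2 \}. \tag{101} $$

Now, using (101), the first term of (100) becomes

$$ N_i^{-2} f_i^2 (N_i + n_i - 1)^2 Var(\sigma_i^2 \mid y_i). \tag{102} $$

The second term of the variance part of (100) is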



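$$ Var\{\gamma_i \mid y_i, \theta_i, \sigma_i^2; \psi\} = Var\Big\{ \tfrac{1}{2} N_i^{-2} \Big[ \sum_{1 \le j \ne j' \le n_i} (y_{ij} - y_{ij'})^2 + 2 \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij} - y_{ij'})^2 + \sum_{n_i+1 \le j \ne j' \le N_i} (y_{ij} - y_{ij'})^2 \Big] \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} $$
$$ = \tfrac{1}{4} N_i^{-4} \Big[ 4 Var\Big\{ \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij} - y_{ij'})^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} + Var\Big\{ \sum_{n_i+1 \le j \ne j' \le N_i} (y_{ij} - y_{ij'})^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} + 4 Cov\Big\{ \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij} - y_{ij'})^2, \sum_{n_i+1 \le j \ne j' \le N_i} (y_{ij} - y_{ij'})^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} \Big]. \tag{103} $$

The first term of (103):

$$ Var\Big\{ \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij} - y_{ij'})^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} = E\Big[ \Big\{ \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij} - y_{ij'})^2 \Big\}^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] - \Big[ E\Big\{ \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij} - y_{ij'})^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} \Big]^2. \tag{104} $$

The first term of (104), i.e.,

$$ E\Big[ \Big\{ \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij} - y_{ij'})^2 \Big\}^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] = E\Big[ \Big\{ \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} [ (y_{ij} - \theta_i)^2 + (y_{ij'} - \theta_i)^2 - 2(y_{ij} - \theta_i)(y_{ij'} - \theta_i) ] \Big\}^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] $$
$$ = E\Big[ \Big\{ \sum_{j} \sum_{j'} (y_{ij} - \theta_i)^2 \Big\}^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] + E\Big[ \Big\{ \sum_{j} \sum_{j'} (y_{ij'} - \theta_i)^2 \Big\}^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] + 4E\Big[ \Big\{ \sum_{j} \sum_{j'} (y_{ij} - \theta_i)(y_{ij'} - \theta_i) \Big\}^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] $$
$$ \quad + 2E\Big[ \sum_{j} \sum_{j'} (y_{ij} - \theta_i)^2 \sum_{j} \sum_{j'} (y_{ij'} - \theta_i)^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] - 4E\Big[ \sum_{j} \sum_{j'} (y_{ij} - \theta_i)^2 \sum_{j} \sum_{j'} (y_{ij} - \theta_i)(y_{ij'} - \theta_i) \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] $$
$$ \quad - 4E\Big[ \sum_{j} \sum_{j'} (y_{ij'} - \theta_i)^2 \sum_{j} \sum_{j'} (y_{ij} - \theta_i)(y_{ij'} - \theta_i) \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big], \tag{105} $$

where every double sum runs over $j = 1, \ldots, n_i$ and $j' = n_i + 1, \ldots, N_i$.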



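The first term of (105):

$$ E\Big[ \Big\{ \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij} - \theta_i)^2 \Big\}^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] = E\Big[ \Big\{ (N_i - n_i) \sum_{j=1}^{n_i} (y_{ij} - \theta_i)^2 \Big\}^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] $$
$$ = (N_i - n_i)^2 E\Big[ \sum_{j=1}^{n_i} (y_{ij} - \theta_i)^4 + 2 \sum_{1 \le j < k \le n_i} (y_{ij} - \theta_i)^2 (y_{ik} - \theta_i)^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] $$
$$ = (N_i - n_i)^2 n_i [\mu_4 + (n_i - 1)\sigma_i^4], \tag{106} $$

where $\mu_4$ denotes the fourth central moment of $y_{ij}$ given $(\theta_i, \sigma_i^2)$.

The second term of (105):

$$ E\Big[ \Big\{ \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij'} - \theta_i)^2 \Big\}^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] = n_i^2 [ (N_i - n_i)\mu_4 + (N_i - n_i)(N_i - n_i - 1)\sigma_i^4 ]. \tag{107} $$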

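The third term of (105):

$$ E\Big[ \Big\{ \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij} - \theta_i)(y_{ij'} - \theta_i) \Big\}^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] $$
$$ = E\Big[ \sum_{j=1}^{n_i} (y_{ij} - \theta_i)^2 \sum_{j'=n_i+1}^{N_i} (y_{ij'} - \theta_i)^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] + E\Big[ \sum_{j=1}^{n_i} (y_{ij} - \theta_i)^2 \sum_{n_i+1 \le j' \ne k' \le N_i} (y_{ij'} - \theta_i)(y_{ik'} - \theta_i) \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] $$
$$ \quad + E\Big[ \sum_{1 \le j \ne k \le n_i} (y_{ij} - \theta_i)(y_{ik} - \theta_i) \sum_{j'=n_i+1}^{N_i} (y_{ij'} - \theta_i)^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] + E\Big[ \sum_{1 \le j \ne k \le n_i} (y_{ij} - \theta_i)(y_{ik} - \theta_i) \sum_{n_i+1 \le j' \ne k' \le N_i} (y_{ij'} - \theta_i)(y_{ik'} - \theta_i) \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] $$
$$ = n_i (N_i - n_i) \sigma_i^4. \tag{108} $$

The expectations of the last three terms in (108) are zero since $y_{ij} - \theta_i$ and $y_{ij'} - \theta_i$, given
$(y_i, \theta_i, \sigma_i^2; \psi)$, are independent with mean zero. The fourth term of (105) becomes

$$ 2E\Big[ (N_i - n_i) \sum_{j=1}^{n_i} (y_{ij} - \theta_i)^2 \, n_i \sum_{j'=n_i+1}^{N_i} (y_{ij'} - \theta_i)^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] $$
$$ = 2n_i(N_i - n_i) E\Big\{ \sum_{j=1}^{n_i} (y_{ij} - \theta_i)^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} E\Big\{ \sum_{j'=n_i+1}^{N_i} (y_{ij'} - \theta_i)^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} $$
$$ = 2n_i^2 (N_i - n_i)^2 \sigma_i^4. \tag{109} $$

The fifth term of (105):

$$ E\Big[ \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij} - \theta_i)^2 \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij} - \theta_i)(y_{ij'} - \theta_i) \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] $$
$$ = E\Big[ (N_i - n_i) \sum_{j=1}^{n_i} (y_{ij} - \theta_i)^2 \sum_{k=1}^{n_i} (y_{ik} - \theta_i) \sum_{j'=n_i+1}^{N_i} (y_{ij'} - \theta_i) \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] $$
$$ = (N_i - n_i) E\Big[ \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij} - \theta_i)^3 (y_{ij'} - \theta_i) \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] + (N_i - n_i) E\Big[ \sum_{1 \le j \ne k \le n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij} - \theta_i)^2 (y_{ik} - \theta_i)(y_{ij'} - \theta_i) \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] $$
$$ = 0. \tag{110} $$

The sixth term of (105):

$$ E\Big[ \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij'} - \theta_i)^2 \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij} - \theta_i)(y_{ij'} - \theta_i) \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] $$
$$ = n_i E\Big[ \sum_{j=1}^{n_i} (y_{ij} - \theta_i) \sum_{j'=n_i+1}^{N_i} (y_{ij'} - \theta_i)^3 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] + n_i E\Big[ \sum_{j=1}^{n_i} (y_{ij} - \theta_i) \sum_{n_i+1 \le j' \ne k \le N_i} (y_{ij'} - \theta_i)^2 (y_{ik} - \theta_i) \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] $$
$$ = 0. \tag{111} $$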

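Using (106)-(111), the first term of (104), i.e.,

$$ E\Big[ \Big\{ \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij} - y_{ij'})^2 \Big\}^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big] $$
$$ = (N_i - n_i)^2 n_i [\mu_4 + (n_i - 1)\sigma_i^4] + n_i^2 (N_i - n_i) [\mu_4 + (N_i - n_i - 1)\sigma_i^4] + 4n_i(N_i - n_i)\sigma_i^4 + 2n_i^2 (N_i - n_i)^2 \sigma_i^4 $$
$$ = n_i(N_i - n_i) [ N_i \mu_4 + (4N_i n_i - 4n_i^2 + 4 - N_i)\sigma_i^4 ]. \tag{112} $$

The second term of (104):

$$ E\Big\{ \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij} - y_{ij'})^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} = \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} E\{ [(y_{ij} - \theta_i) - (y_{ij'} - \theta_i)]^2 \mid y_i, \theta_i, \sigma_i^2; \psi \} $$
$$ = \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} \big[ E\{(y_{ij} - \theta_i)^2 \mid y_i, \theta_i, \sigma_i^2; \psi\} + E\{(y_{ij'} - \theta_i)^2 \mid y_i, \theta_i, \sigma_i^2; \psi\} - 2E\{(y_{ij} - \theta_i)(y_{ij'} - \theta_i) \mid y_i, \theta_i, \sigma_i^2; \psi\} \big] $$
$$ = 2n_i(N_i - n_i)\sigma_i^2. \tag{113} $$

The third term in the above is zero since, conditional on $(y_i, \theta_i, \sigma_i^2)$, $y_{ij} - \theta_i$ and
$y_{ij'} - \theta_i$ are independent. Using (112) and (113), the first term in (103) is

$$ 4\{ n_i(N_i - n_i)[N_i \mu_4 + (4N_i n_i - 4n_i^2 + 4 - N_i)\sigma_i^4] - (2n_i(N_i - n_i)\sigma_i^2)^2 \} $$
$$ = 4n_i(N_i - n_i)[N_i \mu_4 - (N_i - 4)\sigma_i^4]. \tag{114} $$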

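The second term of (103), i.e.,

$$ Var\Big\{ \sum_{n_i+1 \le j \ne j' \le N_i} (y_{ij} - y_{ij'})^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} = Var\Big\{ 2(N_i - n_i) \sum_{j=n_i+1}^{N_i} (y_{ij} - \bar{y}_i^{*})^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} $$
$$ = 4(N_i - n_i)^2 \Big[ E\Big\{ \Big( \sum_{j=n_i+1}^{N_i} (y_{ij} - \bar{y}_i^{*})^2 \Big)^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} - \Big( E\Big\{ \sum_{j=n_i+1}^{N_i} (y_{ij} - \bar{y}_i^{*})^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} \Big)^2 \Big], \tag{115} $$

where $\bar{y}_i^{*} = (N_i - n_i)^{-1} \sum_{j=n_i+1}^{N_i} y_{ij}$. The first term of (115) becomes

$$ E\Big\{ \Big[ \sum_{j=n_i+1}^{N_i} (y_{ij} - \theta_i)^2 - (N_i - n_i)(\bar{y}_i^{*} - \theta_i)^2 \Big]^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} $$
$$ = E\Big\{ \Big[ \sum_{j=n_i+1}^{N_i} (y_{ij} - \theta_i)^2 \Big]^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} + (N_i - n_i)^2 E\{ (\bar{y}_i^{*} - \theta_i)^4 \mid y_i, \theta_i, \sigma_i^2; \psi \} - 2(N_i - n_i) E\Big\{ (\bar{y}_i^{*} - \theta_i)^2 \sum_{j=n_i+1}^{N_i} (y_{ij} - \theta_i)^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\}. \tag{116} $$

The first term of (116) becomes

$$ E\Big\{ \sum_{j=n_i+1}^{N_i} (y_{ij} - \theta_i)^4 + \sum_{n_i+1 \le j \ne j' \le N_i} (y_{ij} - \theta_i)^2 (y_{ij'} - \theta_i)^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} = (N_i - n_i)[\mu_4 + (N_i - n_i - 1)\sigma_i^4]. \tag{117} $$

The second term of (116):

$$ E\{ (\bar{y}_i^{*} - \theta_i)^4 \mid y_i, \theta_i, \sigma_i^2; \psi \} = (N_i - n_i)^{-4} E\Big\{ \Big[ \sum_{j=n_i+1}^{N_i} (y_{ij} - \theta_i) \Big]^4 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} $$
$$ = (N_i - n_i)^{-4} \{ (N_i - n_i)\mu_4 + (N_i - n_i)(N_i - n_i - 1)\sigma_i^4 + 2(N_i - n_i)(N_i - n_i - 1)\sigma_i^4 \} $$
$$ = (N_i - n_i)^{-3} \{ \mu_4 + 3(N_i - n_i - 1)\sigma_i^4 \}. \tag{118} $$

The third term of (116):

$$ E\Big\{ (\bar{y}_i^{*} - \theta_i)^2 \sum_{k=n_i+1}^{N_i} (y_{ik} - \theta_i)^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} = (N_i - n_i)^{-2} E\Big\{ \Big[ \sum_{j=n_i+1}^{N_i} (y_{ij} - \theta_i)^2 + \sum_{n_i+1 \le j \ne j' \le N_i} (y_{ij} - \theta_i)(y_{ij'} - \theta_i) \Big] \sum_{k=n_i+1}^{N_i} (y_{ik} - \theta_i)^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} $$
$$ = (N_i - n_i)^{-2} \{ (N_i - n_i)\mu_4 + (N_i - n_i)(N_i - n_i - 1)\sigma_i^4 \} $$
$$ = (N_i - n_i)^{-1} \{ \mu_4 + (N_i - n_i - 1)\sigma_i^4 \}. \tag{119} $$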

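Using (117), (118) and (119), the first term of (115) becomes

$$ (N_i - n_i)[\mu_4 + (N_i - n_i - 1)\sigma_i^4] + (N_i - n_i)^{-1}\{\mu_4 + 3(N_i - n_i - 1)\sigma_i^4\} - 2\{\mu_4 + (N_i - n_i - 1)\sigma_i^4\} $$
$$ = (N_i - n_i - 1)^2 (N_i - n_i)^{-1} \mu_4 + \sigma_i^4 (N_i - n_i - 1)[(N_i - n_i) + 3/(N_i - n_i) - 2]. \tag{120} $$

The second term of (115), i.e.,

$$ E\Big\{ \sum_{j=n_i+1}^{N_i} (y_{ij} - \bar{y}_i^{*})^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} = E\Big\{ \sum_{j=n_i+1}^{N_i} (y_{ij} - \theta_i)^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} - (N_i - n_i) E\{ (\bar{y}_i^{*} - \theta_i)^2 \mid y_i, \theta_i, \sigma_i^2; \psi \} $$
$$ = (N_i - n_i)\sigma_i^2 - \sigma_i^2 $$
$$ = (N_i - n_i - 1)\sigma_i^2, \tag{121} $$

where $\bar{y}_i^{*}$ is the mean of the $N_i - n_i$ unsampled units. Now, using (120) and (121), the second term of (103), i.e.,

$$ Var\Big\{ \sum_{n_i+1 \le j \ne j' \le N_i} (y_{ij} - y_{ij'})^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} $$
$$ = 4(N_i - n_i)^2 \big[ (N_i - n_i - 1)^2 (N_i - n_i)^{-1} \mu_4 + \sigma_i^4 (N_i - n_i - 1)\{ (N_i - n_i) + 3/(N_i - n_i) - 2 \} - (N_i - n_i - 1)^2 \sigma_i^4 \big] $$
$$ = 4(N_i - n_i)(N_i - n_i - 1)\{ (N_i - n_i - 1)\mu_4 - (N_i - n_i - 3)\sigma_i^4 \}. \tag{122} $$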

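The third term of (103) becomes

$$ 4E\Big\{ \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij} - y_{ij'})^2 \sum_{n_i+1 \le j \ne j' \le N_i} (y_{ij} - y_{ij'})^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} - 4E\Big\{ \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij} - y_{ij'})^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} E\Big\{ \sum_{n_i+1 \le j \ne j' \le N_i} (y_{ij} - y_{ij'})^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\}. \tag{123} $$

The first term of (123) becomes

$$ E\Big\{ \sum_{j=1}^{n_i} \sum_{j'=n_i+1}^{N_i} (y_{ij} - y_{ij'})^2 \sum_{n_i+1 \le j \ne j' \le N_i} (y_{ij} - y_{ij'})^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} $$
$$ = E\Big\{ \Big[ (N_i - n_i) \sum_{j=1}^{n_i} (y_{ij} - \theta_i)^2 + n_i \sum_{j'=n_i+1}^{N_i} (y_{ij'} - \theta_i)^2 - 2 \sum_{j=1}^{n_i} (y_{ij} - \theta_i) \sum_{j'=n_i+1}^{N_i} (y_{ij'} - \theta_i) \Big] \Big[ 2(N_i - n_i - 1) \sum_{j'=n_i+1}^{N_i} (y_{ij'} - \theta_i)^2 - 2 \sum_{n_i+1 \le j' \ne k' \le N_i} (y_{ij'} - \theta_i)(y_{ik'} - \theta_i) \Big] \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} $$
$$ = 2(N_i - n_i - 1)(N_i - n_i) E\Big\{ \sum_{j=1}^{n_i} (y_{ij} - \theta_i)^2 \sum_{j'=n_i+1}^{N_i} (y_{ij'} - \theta_i)^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} + 2n_i(N_i - n_i - 1) E\Big\{ \Big[ \sum_{j'=n_i+1}^{N_i} (y_{ij'} - \theta_i)^2 \Big]^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} $$
$$ \quad - 4(N_i - n_i - 1) E\Big\{ \sum_{j=1}^{n_i} (y_{ij} - \theta_i) \sum_{j'=n_i+1}^{N_i} (y_{ij'} - \theta_i) \sum_{k'=n_i+1}^{N_i} (y_{ik'} - \theta_i)^2 \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} - 2(N_i - n_i) E\Big\{ \sum_{j=1}^{n_i} (y_{ij} - \theta_i)^2 \sum_{n_i+1 \le j' \ne k' \le N_i} (y_{ij'} - \theta_i)(y_{ik'} - \theta_i) \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} $$
$$ \quad - 2n_i E\Big\{ \sum_{j'=n_i+1}^{N_i} (y_{ij'} - \theta_i)^2 \sum_{n_i+1 \le j' \ne k' \le N_i} (y_{ij'} - \theta_i)(y_{ik'} - \theta_i) \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} + 4E\Big\{ \sum_{j=1}^{n_i} (y_{ij} - \theta_i) \sum_{j'=n_i+1}^{N_i} (y_{ij'} - \theta_i) \sum_{n_i+1 \le j' \ne k' \le N_i} (y_{ij'} - \theta_i)(y_{ik'} - \theta_i) \,\Big|\, y_i, \theta_i, \sigma_i^2; \psi \Big\} $$
$$ = 2(N_i - n_i - 1) n_i (N_i - n_i)^2 \sigma_i^4 + 2n_i(N_i - n_i - 1)\{ (N_i - n_i)\mu_4 + (N_i - n_i)(N_i - n_i - 1)\sigma_i^4 \}, \tag{124} $$

since the third to the sixth terms in the above are zero. Note that the second term
of (123) follows from equation (113), i.e., $2n_i(N_i - n_i)\sigma_i^2$, and equation (121),
i.e., $(N_i - n_i - 1)\sigma_i^2$. Then, using (124), the third term of equation (103) becomes

$$ 8n_i(N_i - n_i)(N_i - n_i - 1)(\mu_4 - \sigma_i^4). \tag{125} $$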



Using equations (114), (122) and (125), the second part of the variance in (100) is

$$ \tfrac{1}{4} N_i^{-4} \big[ 4(N_i - n_i)n_i\{ N_i \mu_4 - (N_i - 4)\sigma_i^4 \} + 4(N_i - n_i)(N_i - n_i - 1)\{ (N_i - n_i - 1)\mu_4 - (N_i - n_i - 3)\sigma_i^4 \} + 8n_i(N_i - n_i)(N_i - n_i - 1)(\mu_4 - \sigma_i^4) \big]. \tag{126} $$

Note that under the normal distribution $\mu_4 = 3\sigma_i^4$, so equation (126) becomes

$$ N_i^{-4} \big[ 2(N_i - n_i)n_i(N_i + 2)\sigma_i^4 + 2(N_i - n_i)^2 (N_i - n_i - 1)\sigma_i^4 + 4n_i(N_i - n_i)(N_i - n_i - 1)\sigma_i^4 \big] $$
$$ = N_i^{-4}(N_i - n_i)\sigma_i^4 [ 2n_i(N_i + 2) + 2(N_i - n_i - 1)(N_i + n_i) ] $$
$$ = 2N_i^{-3} f_i \sigma_i^4 C_i, \tag{127} $$

where $C_i = n_i(N_i + 2) + (N_i + n_i)(N_i - n_i - 1)$. Finally, using (102) and (127),
the variance of the $i$th population variance is given by

$$ Var\Big[ N_i^{-1} \sum_{j=1}^{N_i} (y_{ij} - \bar{Y}_i)^2 \,\Big|\, y_i \Big] = N_i^{-2} \big[ f_i^2 (N_i + n_i - 1)^2 Var(\sigma_i^2 \mid y_i) + 2N_i^{-1} f_i C_i E(\sigma_i^4 \mid y_i) \big]. \tag{128} $$
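The algebra leading from the general fourth-moment expression to the normal-theory form can be checked with exact rational arithmetic. The sketch below is illustrative only: it encodes (126), (127) and (128) as read here, with $C_i = n_i(N_i + 2) + (N_i + n_i)(N_i - n_i - 1)$, and it assumes that the posterior moments $Var(\sigma_i^2 \mid y_i)$ and $E(\sigma_i^4 \mid y_i)$ are supplied by the caller. All function names are hypothetical.

```python
from fractions import Fraction


def c_constant(N, n):
    """C_i = n(N + 2) + (N + n)(N - n - 1)."""
    return n * (N + 2) + (N + n) * (N - n - 1)


def conditional_variance_general(N, n, sigma4, mu4):
    """Expression (126): conditional variance for a general fourth moment mu4."""
    m = N - n  # number of unsampled units
    total = (4 * m * n * (N * mu4 - (N - 4) * sigma4)
             + 4 * m * (m - 1) * ((m - 1) * mu4 - (m - 3) * sigma4)
             + 8 * n * m * (m - 1) * (mu4 - sigma4))
    return Fraction(total) / (4 * N ** 4)


def conditional_variance_normal(N, n, sigma4):
    """Expression (127): the normal case, equal to 2 N^-3 f sigma^4 C."""
    return Fraction(2 * (N - n) * sigma4 * c_constant(N, n)) / N ** 4


def posterior_variance(N, n, post_var_sigma2, post_mean_sigma4):
    """Expression (128), given posterior moments of sigma_i^2 from the caller."""
    f = (N - n) / N
    first = f ** 2 * (N + n - 1) ** 2 * post_var_sigma2
    second = 2 * f * c_constant(N, n) * post_mean_sigma4 / N
    return (first + second) / N ** 2


# Under normality mu4 = 3 * sigma^4, so the two expressions should coincide.
print(conditional_variance_general(10, 4, 2, 6), conditional_variance_normal(10, 4, 2))
```

Using `Fraction` avoids rounding noise, so equality of the two expressions is exact rather than approximate.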



Table 29: Empirical Bayes Estimate $e_i^{EB}$ of $\gamma_i$, where $\eta = 10$ and $\xi = 4$

  No.                      GL      e_i^EB   Naive MSE   Boots MSE
   1     1.913    1.023    2.388   1.578    0.236       1.861
   2     4.006    8.814    3.095   6.183    3.113       1.989
   3     6.264    9.868    3.184   6.653    3.531       1.904
   4     2.741    3.617    2.621   3.074    0.806       2.210
   5     1.264    1.414    2.424   1.807    0.302       2.171
   6     2.042    2.527    2.526   2.468    0.537       1.743
   7     1.999    1.570    2.441   1.914    0.337       1.726
   8     1.659    1.277    2.417   1.754    0.289       1.700
   9     3.049    1.691    2.458   2.027    0.380       1.970
  10     1.122    0.971    2.384   1.551    0.229       1.845
  11     1.050    0.954    2.382   1.538    0.226       1.593
  12     2.229    3.991    2.655   3.292    0.917       1.975
  13     1.901    2.631    2.533   2.507    0.550       2.082
  14     4.619    3.034    2.580   2.844    0.715       1.732
  15     1.828    1.440    2.427   1.826    0.308       1.874
  16     3.566    3.197    2.606   3.071    0.851       1.888
  17     2.211    1.296    2.418   1.762    0.291       1.930
  18     2.417    0.785    2.372   1.461    0.208       1.881
  19     1.398    1.126    2.400   1.650    0.257       1.822
  20     1.162    2.040    2.482   2.186    0.430       1.833
  21     1.015    0.952    2.393   1.586    0.243       1.833
  22     1.482    1.184    2.403   1.673    0.262       2.043
  23     2.816    5.411    2.784   4.125    1.412       2.013
  24     3.458    5.506    2.794   4.193    1.460       1.864
  25     0.809    1.034    2.389   1.585    0.238       1.885
  26     2.636    1.667    2.452   1.986    0.362       1.797
  27     1.516    1.459    2.430   1.843    0.314       2.014
  28     2.646    1.878    2.467   2.086    0.393       1.798
  29     1.774    2.375    2.549   2.716    0.702       1.932
  30     2.031    1.978    2.475   0.138    0.411       2.997



Table 30: The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of different estimates of $\gamma_i$, where $\eta = 5$

  A      Measure   S^2      s_i^2    GL       Proposed
  0.25   AAB       0.1364   0.1118   0.1236   0.0736
  0.25   ASD       0.0279   0.0235   0.0225   0.0105
  0.25   ARB       0.6506   0.3867   0.5824   0.2921
  0.25   ARSD      0.1101   0.0589   0.0881   0.0323

  0.50   AAB       0.2729   0.2236   0.2472   0.1459
  0.50   ASD       0.1116   0.0940   0.0898   0.0417
  0.50   ARB       0.6506   0.3867   0.5823   0.2905
  0.50   ARSD      0.2202   0.1178   0.1761   0.0641

  1.00   AAB       0.5457   0.4473   0.4943   0.2876
  1.00   ASD       0.4464   0.3759   0.3591   0.1642
  1.00   ARB       0.6506   0.3867   0.5821   0.2878
  1.00   ARSD      0.4405   0.2355   0.3520   0.1267

  2.00   AAB       1.0915   0.8945   0.9881   0.5649
  2.00   ASD       1.7857   1.5036   1.4353   0.6469
  2.00   ARB       0.6506   0.3867   0.5818   0.2842
  2.00   ARSD      0.8810   0.4710   0.7031   0.2495

  4.00   AAB       2.1830   1.7890   1.9750   1.1161
  4.00   ASD       7.1428   6.0143   5.7355   2.5685
  4.00   ARB       0.6506   0.3867   0.5813   0.2811
  4.00   ARSD      1.7619   0.9420   1.4040   0.4927



Table 31: The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of different estimates of $\gamma_i$, where
$\eta = 10$

  A      Measure   S^2      s_i^2    GL       Proposed
  0.25   AAB       0.1190   0.1227   0.1039   0.0817
  0.25   ASD       0.0225   0.0333   0.0171   0.0104
  0.25   ARB       0.5468   0.3674   0.4836   0.3173
  0.25   ARSD      0.0915   0.0772   0.0721   0.0338

  0.50   AAB       0.2379   0.2453   0.2078   0.1628
  0.50   ASD       0.0899   0.1331   0.0685   0.0412
  0.50   ARB       0.5468   0.3674   0.4835   0.3166
  0.50   ARSD      0.1829   0.1545   0.1441   0.0674

  1.00   AAB       0.4758   0.4906   0.4155   0.3230
  1.00   ASD       0.3595   0.5322   0.2738   0.1629
  1.00   ARB       0.5468   0.3674   0.4834   0.3154
  1.00   ARSD      0.3659   0.3089   0.2879   0.1340

  2.00   AAB       0.9517   0.9812   0.8304   0.6371
  2.00   ASD       1.4381   2.1290   1.0937   0.6385
  2.00   ARB       0.5468   0.3674   0.4830   0.3130
  2.00   ARSD      0.7318   0.6178   0.5748   0.2650

  4.00   AAB       1.9033   1.9624   1.6585   1.2486
  4.00   ASD       5.7522   8.5160   4.3651   2.4777
  4.00   ARB       0.5468   0.3674   0.4823   0.3092
  4.00   ARSD      1.4635   1.2356   1.1465   0.5189



Table 32: The Average Absolute Bias, Average Square Deviation, Average Relative
Bias and Average Relative Square Deviation of different estimates of $\gamma_i$, where
$\eta = 100$

  A      Measure   S^2      s_i^2    GL       Proposed
  0.25   AAB       0.0362   0.0946   0.0343   0.0369
  0.25   ASD       0.0019   0.0125   0.0017   0.0019
  0.25   ARB       0.1472   0.3660   0.1380   0.1491
  0.25   ARSD      0.0077   0.0005   0.0070   0.0079

  0.50   AAB       0.0723   0.1893   0.0686   0.0737
  0.50   ASD       0.0074   0.0502   0.0069   0.0077
  0.50   ARB       0.1472   0.3660   0.1380   0.1491
  0.50   ARSD      0.0154   0.0963   0.0139   0.0158

  1.00   AAB       0.1446   0.3786   0.1374   0.1474
  1.00   ASD       0.0297   0.2006   0.0276   0.0309
  1.00   ARB       0.1472   0.3660   0.1381   0.1490
  1.00   ARSD      0.0307   0.1927   0.0279   0.0316

  2.00   AAB       0.2893   0.7571   0.2753   0.2947
  2.00   ASD       0.1188   0.8024   0.1108   0.1236
  2.00   ARB       0.1472   0.3660   0.1384   0.1489
  2.00   ARSD      0.0615   0.3853   0.0561   0.0631

  4.00   AAB       0.5785   1.5142   0.5518   0.5891
  4.00   ASD       0.4751   3.2096   0.4454   0.4938
  4.00   ARB       0.1472   0.3660   0.1386   0.1489
  4.00   ARSD      0.1230   0.7706   0.1125   0.1261



CHAPTER 6

Beta Binomial in Finite Population Sampling
6.1 Introduction

The methods considered in Chapter 4 and Chapter 5 are valid when the observations
are measured on an interval scale. Despite the importance of the analysis of
binary data in finite population sampling, there is very little emphasis on empirical
Bayes estimation that fully specifies a binary model in the estimation procedure.
The linear empirical Bayes method of Ghosh and Lahiri (1987) can be used to
analyze binary data from a stratified simple random sample. But because it relies on
a robust model, their method does not capture the special features of binary data.
Farrell et al. (1992) considered an empirical Bayes method to estimate female
labor force participation rates for small areas in the United States. They did not,
however, consider empirical Bayes estimation in finite population sampling. Also, a
measure of uncertainty of the empirical Bayes estimator that captures all sources
of variability has not been proposed in the context of finite population proportion
estimation.

In Section 6.2, we consider the Bayes estimation of a finite population proportion
from a stratified simple random sample. We consider estimation of the prior
parameters in Section 6.3 and propose an empirical Bayes estimator of the finite population
proportion. Finally, in Section 6.4, we propose a measure of uncertainty of the
empirical Bayes estimator. The proposed measure of uncertainty captures all sources of
variation.

6.2 The Bayes Estimation of $\gamma_i = N_i^{-1} \sum_{j=1}^{N_i} y_{ij}$

Let $y_{ij}$ denote the value of a characteristic of interest for the $j$th unit of the
$i$th area $(i = 1, \ldots, m;\ j = 1, \ldots, N_i)$. We shall consider the following model:
MODEL 5

(i) Conditional on $\theta_i$, the $y_{ij}$'s are independent with
$y_{ij} \mid \theta_i \sim Bernoulli(\theta_i)$, $(i = 1, \ldots, m;\ j = 1, \ldots, N_i)$;

(ii) $\theta_i \overset{iid}{\sim} Beta(\alpha, \beta)$, $(i = 1, \ldots, m)$.

Our objective is to estimate $\gamma_i = N_i^{-1} \sum_{j=1}^{N_i} y_{ij}$, the finite population proportion
for the $i$th stratum $(i = 1, \ldots, m)$.

Under Model 5 and squared error loss, the Bayes estimator of $\gamma_i$ is given
by

$$ e_i^B = E(\gamma_i \mid y_i; \phi) = N_i^{-1} \Big[ \sum_{j=1}^{n_i} y_{ij} + \sum_{j=n_i+1}^{N_i} E(y_{ij} \mid y_i; \phi) \Big]. \tag{129} $$



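Using standard Bayesian calculation (see, e.g., Berger (1985)), it can be shown
that, for $j = n_i + 1, \ldots, N_i$,

$$ E(y_{ij} \mid y_i; \phi) = E(\theta_i \mid y_i; \phi) = \frac{n_i \bar{y}_i + \alpha}{n_i + \alpha + \beta} = (1 - w_i)\bar{y}_i + w_i \frac{\alpha}{\alpha + \beta}, \tag{130} $$

where $\phi = (\alpha, \beta)$, $w_i = \dfrac{\alpha + \beta}{n_i + \alpha + \beta}$ and $\bar{y}_i = n_i^{-1} \sum_{j=1}^{n_i} y_{ij}$. Using (130) in (129), then

$$ e_i^B = (1 - f_i)\bar{y}_i + f_i \Big\{ (1 - w_i)\bar{y}_i + w_i \frac{\alpha}{\alpha + \beta} \Big\} $$
$$ = (1 - f_i w_i)\bar{y}_i + f_i w_i \frac{\alpha}{\alpha + \beta} $$
$$ = (1 - B_i)\bar{y}_i + B_i \frac{\alpha}{\alpha + \beta}, \tag{131} $$

where $f_i = (N_i - n_i)/N_i$. Note that $B_i = f_i w_i$ is a decreasing function of $n_i$.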
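A minimal sketch of the Bayes estimator of the stratum proportion follows. It assumes the hyperparameters $\alpha$ and $\beta$ are known and uses the shrinkage form $(1 - B_i)\bar{y}_i + B_i\,\alpha/(\alpha + \beta)$ with $B_i = f_i w_i$, $f_i = (N_i - n_i)/N_i$ and $w_i = (\alpha + \beta)/(n_i + \alpha + \beta)$; a second function computes the same quantity directly from the predictive mean of the unsampled units. Function names and the example values are illustrative.

```python
def bayes_proportion(sample, N, alpha, beta):
    """Shrinkage form of the Bayes estimator of the finite population proportion."""
    n = len(sample)
    ybar = sum(sample) / n
    w = (alpha + beta) / (n + alpha + beta)   # weight on the prior mean
    B = (N - n) / N * w                       # B_i = f_i * w_i
    return (1 - B) * ybar + B * alpha / (alpha + beta)


def bayes_proportion_direct(sample, N, alpha, beta):
    """Observed total plus predicted total of the N - n unsampled units, divided by N."""
    n = len(sample)
    posterior_mean = (sum(sample) + alpha) / (n + alpha + beta)
    return (sum(sample) + (N - n) * posterior_mean) / N


sample = [1, 0, 1, 1]
print(bayes_proportion(sample, 10, 2.0, 2.0), bayes_proportion_direct(sample, 10, 2.0, 2.0))
```

Because $B_i$ shrinks as $n_i$ grows, strata with larger samples rely more on their own sample proportion and less on the prior mean.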




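6.3 Empirical Bayes Estimation of $\gamma_i = N_i^{-1} \sum_{j=1}^{N_i} y_{ij}$

To estimate $\phi = (\alpha, \beta)$, we use the method of moments. Note that
$\bar{y} = n^{-1} \sum_{i=1}^{m} \sum_{j=1}^{n_i} y_{ij} = n^{-1} \sum_{i=1}^{m} n_i \bar{y}_i$, where $n = \sum_{i=1}^{m} n_i$, and that $E(\bar{y}) = \frac{\alpha}{\alpha + \beta}$. Let
$MSW = (n - m)^{-1} \sum_{i=1}^{m} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2$. Note that $E(MSW) = \sigma^2$, where
$\sigma^2 = E\,Var(y_{ij} \mid \theta_i) = \frac{\alpha\beta}{(\alpha + \beta)(\alpha + \beta + 1)}$. Thus, by equating $\frac{\alpha}{\alpha + \beta} = \bar{y}$ and
$\frac{\alpha\beta}{(\alpha + \beta)(\alpha + \beta + 1)} = MSW$, we get

$$ \hat{\alpha} = \frac{\bar{y} \sum_{i=1}^{m} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2}{(n - m)\bar{y}(1 - \bar{y}) - \sum_{i=1}^{m} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2}, \qquad \hat{\beta} = \frac{(1 - \bar{y}) \sum_{i=1}^{m} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2}{(n - m)\bar{y}(1 - \bar{y}) - \sum_{i=1}^{m} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2}. \tag{132} $$

Now the empirical Bayes estimator of $\gamma_i = N_i^{-1} \sum_{j=1}^{N_i} y_{ij}$ is given by

$$ e_i^{EB} = (1 - \hat{B}_i)\bar{y}_i + \hat{B}_i \frac{\hat{\alpha}}{\hat{\alpha} + \hat{\beta}}, \tag{133} $$

where $\hat{B}_i$ is $B_i$ evaluated at $(\hat{\alpha}, \hat{\beta})$.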
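The moment equations can be sketched in a few lines. The code below is a hedged illustration rather than the author's implementation: it matches the overall sample proportion to $\alpha/(\alpha + \beta)$ and the within-stratum mean square $MSW$ to $\alpha\beta/\{(\alpha + \beta)(\alpha + \beta + 1)\}$, and then plugs the estimates into the shrinkage form of the estimator. Function names and the toy data are assumptions; the error raised for a non-positive denominator is one possible way to flag the case in which the moment estimates are negative.

```python
def moment_estimates(samples):
    """Method-of-moments estimates (alpha_hat, beta_hat) from 0/1 samples.

    `samples` holds one list of 0/1 responses per stratum.
    """
    m = len(samples)
    n = sum(len(s) for s in samples)
    ybar = sum(sum(s) for s in samples) / n
    # Within-stratum sum of squares, so that MSW = ssw / (n - m).
    ssw = sum(sum((v - sum(s) / len(s)) ** 2 for v in s) for s in samples)
    denom = (n - m) * ybar * (1 - ybar) - ssw
    if denom <= 0:
        raise ValueError("moment estimates are not positive for these data")
    return ybar * ssw / denom, (1 - ybar) * ssw / denom


def empirical_bayes_proportion(sample, N, alpha_hat, beta_hat):
    """Plug-in version of the shrinkage estimator for one stratum of size N."""
    n = len(sample)
    ybar_i = sum(sample) / n
    w_hat = (alpha_hat + beta_hat) / (n + alpha_hat + beta_hat)
    B_hat = (N - n) / N * w_hat
    return (1 - B_hat) * ybar_i + B_hat * alpha_hat / (alpha_hat + beta_hat)


samples = [[1, 1, 1, 1, 0], [0, 0, 0, 0, 1], [1, 1, 0, 0, 0]]  # toy data
alpha_hat, beta_hat = moment_estimates(samples)
print(alpha_hat, beta_hat, empirical_bayes_proportion(samples[0], 20, alpha_hat, beta_hat))
```

For the toy data the two moment conditions are satisfied simultaneously by the returned pair, which can be confirmed by substituting it back into both equations.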

Note that it is possible to have $\hat{\alpha} < 0$ (which also implies $\hat{\beta} < 0$). In that case,
we replace the Beta-Binomial model by the following model:

(i) $y_{ij} \mid \theta_i \sim Bernoulli(\theta_i)$, $(i = 1, \ldots, m;\ j = 1, \ldots, N_i)$;

(ii) $\theta_i \sim U(0, u)$, $(i = 1, \ldots, m)$, where $u$ is an unknown parameter and
$0 < u < 1$.

Then the posterior distribution of $\theta_i$ is a Beta distribution with parameters $n_i \bar{y}_i + 1$ and
$n_i - n_i \bar{y}_i + 1$, truncated at $u$. Then $E(\theta_i \mid y_i; u) = \int_0^u \theta_i f(\theta_i \mid y_i; u)\, d\theta_i$ can be easily found from the
tables for incomplete beta integrals. The unknown parameter $u$ can be estimated
by $\hat{u} = 2\bar{y}$.
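In place of tables of the incomplete beta integral, the posterior mean under the uniform prior can be approximated numerically. The sketch below is only an illustration and swaps in composite Simpson integration of the unnormalized posterior kernel $\theta^{s}(1 - \theta)^{n - s}$ on $(0, u)$, where $s$ is the number of successes in the sample; the function name, the argument names and the default number of subintervals are assumptions.

```python
def posterior_mean_uniform_prior(successes, n, u, steps=2000):
    """Posterior mean of theta under a U(0, u) prior, by Simpson's rule.

    `steps` must be even; the posterior kernel is theta^s (1 - theta)^(n - s)
    restricted to the interval (0, u).
    """
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")

    def kernel(t):
        return t ** successes * (1 - t) ** (n - successes)

    def simpson(g):
        h = u / steps
        total = g(0.0) + g(u)
        for k in range(1, steps):
            total += (4 if k % 2 else 2) * g(k * h)
        return total * h / 3

    return simpson(lambda t: t * kernel(t)) / simpson(kernel)


# With u = 1 the prior is U(0, 1) and the posterior mean is (s + 1) / (n + 2).
print(posterior_mean_uniform_prior(3, 4, 1.0))
print(posterior_mean_uniform_prior(3, 4, 0.5))
```

The case $u = 1$ gives a convenient check, since the posterior is then an ordinary Beta distribution with a closed-form mean.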

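6.4 Measure of Uncertainty of $e_i^{EB}$

A measure of variability of $e_i^B$ is given by

$$ Var_i^B = Var(\gamma_i \mid y_i; \phi) = N_i^{-2} Var\Big\{ \sum_{j=n_i+1}^{N_i} y_{ij} \,\Big|\, y_i; \phi \Big\} $$
$$ = N_i^{-2} \Big[ E\Big\{ Var\Big( \sum_{j=n_i+1}^{N_i} y_{ij} \,\Big|\, y_i, \theta_i \Big) \,\Big|\, y_i; \phi \Big\} + Var\Big\{ E\Big( \sum_{j=n_i+1}^{N_i} y_{ij} \,\Big|\, y_i, \theta_i \Big) \,\Big|\, y_i; \phi \Big\} \Big] $$
$$ = N_i^{-2} \big[ (N_i - n_i) E\{\theta_i(1 - \theta_i) \mid y_i; \phi\} + (N_i - n_i)^2 Var(\theta_i \mid y_i; \phi) \big] $$
$$ = N_i^{-2} \Big[ (N_i - n_i) \frac{(n_i \bar{y}_i + \alpha)(n_i - n_i \bar{y}_i + \beta)}{(n_i + \alpha + \beta)(n_i + \alpha + \beta + 1)} + (N_i - n_i)^2 \frac{(n_i \bar{y}_i + \alpha)(n_i - n_i \bar{y}_i + \beta)}{(n_i + \alpha + \beta)^2 (n_i + \alpha + \beta + 1)} \Big] $$
$$ = N_i^{-2} (N_i - n_i)(N_i + \alpha + \beta) \frac{(n_i \bar{y}_i + \alpha)(n_i - n_i \bar{y}_i + \beta)}{(n_i + \alpha + \beta)^2 (n_i + \alpha + \beta + 1)}. \tag{134} $$

A naive measure of variability of the empirical Bayes estimator $e_i^{EB}$ is obtained
as $Var_i^B(y_i; \hat{\phi})$. Note that this naive measure underestimates the true variability of $e_i^{EB}$ since
it does not incorporate the additional variability due to the estimation of $\phi$.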
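The posterior variance of the stratum proportion can be sketched as follows, assuming a Beta$(\alpha, \beta)$ prior so that $\theta_i$ given the sample is Beta with parameters $a = s + \alpha$ and $b = n_i - s + \beta$, where $s$ is the number of successes in the sample. The first function adds the two components (expected Bernoulli variance of the unsampled units and variance of their conditional mean); the second uses the equivalent single-fraction form. Names and example values are illustrative; evaluating either function at estimated hyperparameters gives the naive measure.

```python
def posterior_variance_proportion(successes, n, N, alpha, beta):
    """Two-component form: (N - n) E[theta(1 - theta)] + (N - n)^2 Var(theta), over N^2."""
    a = successes + alpha
    b = n - successes + beta
    s = a + b                                   # equals n + alpha + beta
    expected_bernoulli_var = a * b / (s * (s + 1))
    var_theta = a * b / (s ** 2 * (s + 1))
    return ((N - n) * expected_bernoulli_var + (N - n) ** 2 * var_theta) / N ** 2


def posterior_variance_proportion_closed(successes, n, N, alpha, beta):
    """Equivalent single-fraction form of the same posterior variance."""
    a = successes + alpha
    b = n - successes + beta
    s = a + b
    return (N - n) * (N + alpha + beta) * a * b / (N ** 2 * s ** 2 * (s + 1))


print(posterior_variance_proportion(3, 4, 10, 2.0, 2.0),
      posterior_variance_proportion_closed(3, 4, 10, 2.0, 2.0))
```

Both forms return zero when the whole stratum is observed ($n_i = N_i$), since nothing is left to predict.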

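Equation (10) of Laird and Louis (1987) can be extended to arrive at the
following measure of variability of $e_i^{EB}$:

$$ Var_i^{EB} = R^{-1} \sum_{r=1}^{R} Var_i^B(y_i; \hat{\phi}_r^{*}) + (R - 1)^{-1} \sum_{r=1}^{R} \{ e_i^B(y_i; \hat{\phi}_r^{*}) - \bar{e}_i^B(y_i) \}^2, \tag{135} $$

where $\bar{e}_i^B(y_i) = R^{-1} \sum_{r=1}^{R} e_i^B(y_i; \hat{\phi}_r^{*})$ and $\hat{\phi}_r^{*}$ is an estimate of $\phi$ based on the $r$th
bootstrap sample.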
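Once the bootstrap replicates are available, the combination step is simple arithmetic. The sketch below assumes that, for a given stratum, the caller has already computed the naive variance and the Bayes estimate at each of the $R$ bootstrap hyperparameter estimates; how the bootstrap samples themselves are generated is outside this sketch. The function name and the example numbers are hypothetical.

```python
def bootstrap_adjusted_variance(naive_variances, bayes_estimates):
    """Average naive variance plus the sample variance of the bootstrap Bayes estimates.

    naive_variances[r] : naive variance evaluated at the r-th bootstrap estimate
    bayes_estimates[r] : Bayes estimate evaluated at the r-th bootstrap estimate
    """
    R = len(naive_variances)
    if R < 2 or len(bayes_estimates) != R:
        raise ValueError("need at least two replicates and equal-length inputs")
    mean_estimate = sum(bayes_estimates) / R
    within = sum(naive_variances) / R
    between = sum((e - mean_estimate) ** 2 for e in bayes_estimates) / (R - 1)
    return within + between


print(bootstrap_adjusted_variance([0.02, 0.03, 0.04], [0.6, 0.7, 0.8]))
```

The second component is what the naive measure omits: it grows with the spread of the Bayes estimates across bootstrap replicates, i.e. with the uncertainty in the estimated hyperparameters.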






BIBLIOGRAPHY 

Arora, V. (1994), "Empirical Bayes and Hierarchical Bayes Estimation of Small 
Area Characteristics," Ph.D. thesis, University of Nebraska Lincoln, Dept. 
of Mathematics and Statistics. 

Arora, V., Lahiri, P. and Mukerjee, K. (1997), "Empirical Bayes Estimation of 
Finite Population Means from Complex Surveys," to appear in Journal 
of American Statistical Association. 

Battese, G.E., Harter, R.M. and Fuller, W.A. (1988), "An Error-Components Model
for Prediction of County Crop Areas Using Survey and Satellite Data,"
Journal of American Statistical Association, 83, 28 - 36.

Berger, J.O. (1985), "Statistical Decision Theory and Bayesian Analysis," (2nd
edn.), Springer-Verlag.

Brackstone, G.J. (1987), "Small Area Data: Policy Issues and Technical Challenges,"
Small Area Statistics: An International Symposium, Eds. Platek,
R., Rao, J.N.K., Sarndal, C.E., and Singh, M.P., Wiley, New York, pp.
3 - 20.

Carter, G.M., and Rolph, J.E. (1974), "Empirical Bayes Methods Applied to
Estimating Fire Alarm Probabilities," Journal of the American Statistical
Association, 69, 880 - 885.

Chaudhuri, A. (1992), "Small Domain Statistics: a review," Technical Report
ASC/92/2, Indian Statistical Institute, Calcutta.

Cochran, W.G. (1977), "Sampling Techniques," (3rd edn.), New York: John Wiley.

Cox, D.R., and Reid, N. (1987), "Parameter Orthogonality and Approximate
Conditional Inference," Journal of the Royal Statistical Society, B, 49, 1 - 39,
(with discussion).

Cressie, N. (1992), "REML Estimation in Empirical Bayes Smoothing of Census
Undercount," Survey Methodology, 18, 75 - 94.



Datta, G.S., and Ghosh, M. (1993), "Bayesian Estimation of Finite Population
Variances with Auxiliary Information," Sankhyā: The Indian Journal of
Statistics, B, 55, 156 - 170.

Datta, G.S., Ghosh, M., Huang, E.T., Schultz, L.K., and Tsay, J.H. (1992),
"Hierarchical and Empirical Bayes Method for Adjustment of Census
Undercount: The 1988 Missouri Dress Rehearsal Data," Survey Methodology,
18, 95 - 108.

Datta, G.S., Ghosh, M., Nangia, N., and Natarajan, K. (1996), "Estimation of
Median Income of Four Person Families: a Bayesian Approach," Tech.
Report #429, Department of Statistics, University of Florida.

Datta, G.S., and Lahiri, P. (1997), "Second Order Approximation to the Mean
Squared Error of EBLUP in Small-Area Estimation Problems,"
unpublished report.

Dick, P. (1995), "Modelling Net Undercoverage in the 1991 Census," Survey
Methodology, 21, 45 - 54.

Efron, B., and Morris, C. (1975), "Data Analysis Using Stein's Estimator and its
Generalizations," Journal of American Statistical Association, 70, 311 -
319.

Erdelyi, A. (1956), "Asymptotic Expansions," New York: Dover.

Ericksen, E.P. (1974), "A Regression Method for Estimating Population of Local
Areas," Journal of American Statistical Association, 69, 867 - 875.

Ericson, W.A. (1969a), "Subjective Bayesian Models in Sampling Finite
Populations," Journal of the Royal Statistical Society, B, 31, 195 - 233.

Ericson, W.A. (1969b), "A Note on the Posterior Mean," Journal of the Royal
Statistical Society, B, 31, 332 - 334.

Farrell, P., MacGibbon, B., and Tomberlin, T. (1979), "Empirical Bayes
Estimators of Small Area Proportions in Multistage Designs," Working Paper
Series, Faculty of Commerce and Administration, Concordia University.

Fay, R.E. (1987), "Application of Multivariate Regression to Small Domain
Estimation," Small Area Statistics, Eds. R. Platek, J.N.K. Rao, C.E. Sarndal,
and M.P. Singh, Wiley, New York, pp. 91 - 102.

Fay, R.E., and Herriot, R.A. (1979), "Estimates of Income for Small Places:
an Application of James-Stein Procedure to Census Data," Journal of
American Statistical Association, 74, 269 - 277.

Fay, R.E., Nelson, C.T., and Litow, L. (1993), "Estimation of Median Income for
4-Person Families by State," in Indirect Estimators in Federal Programs,
Statistical Policy Working Paper 21, Statistical Policy Office, Office of
Management and Budget, pp. 901 - 917.

Ferguson, T.S. (1973), "A Bayesian Analysis of some Nonparametric Problems,"
The Annals of Statistics, 1, 209 - 230.

Ghosh, M., and Lahiri, P. (1987a), "Robust Empirical Bayes Estimation of Means
from Stratified Samples," Journal of American Statistical Association,
82, 1153 - 1162.

Ghosh, M., and Lahiri, P. (1987b), "Robust Empirical Bayes Estimation of
Variance from Stratified Samples," Sankhyā, B, 49, 78 - 89.

Ghosh, M., and Meeden, G. (1986), "Empirical Bayes Estimation in Finite
Population Sampling," Journal of American Statistical Association, 81, 1058 -
1062.

Ghosh, M., Nangia, N., and Kim, D. (1996), "Estimation of Median Income of
Four-Person Families: A Bayesian Time Series Approach," Journal of
American Statistical Association, 91, 1423 - 1431.

Ghosh, M., and Rao, J.N.K. (1994), "Small Area Estimation: An Appraisal,"
Statistical Science, 9, No. 1, 55 - 93, (with discussion).

Goldstein, M. (1975), "A Note on Some Bayesian Nonparametric Estimates," The
Annals of Statistics, 3, 736 - 740.

Gonzalez, M.E. (1973), "Use and Evaluation of Synthetic Estimators," Proceedings
of the American Statistical Association, Social Statistics Section, 33 - 36.
Washington D.C.

Hartigan, J.A. (1969), "Linear Bayes Methods," Journal of the Royal Statistical
Society, B, 31, 446 - 454.

Harville, D.A. (1977), "Maximum Likelihood Approaches to Variance Component
Estimation and to Related Problems," Journal of American Statistical
Association, 72, 320 - 340.

Henderson, C.R. (1975), "Best Linear Unbiased Estimation and Prediction under
a Selection Model," Biometrics, 31, 423 - 447.

Holt, D., Smith, T.M.F., and Tomberlin, T.J. (1979), "A Model-Based Approach
to Estimation for Small Subgroups of a Population," Journal of the
American Statistical Association, 74, 405 - 410.

Kackar, R.N., and Harville, D.A. (1984), "Approximations for Standard Errors
of Estimators of Fixed and Random Effects in Mixed Linear Models,"
Journal of the American Statistical Association, 79, 853 - 862.

Kass, R.E., and Steffey, D. (1989), "Approximate Bayesian Inference in
Conditionally Independent Hierarchical Models (Parametric Empirical Bayes
Models)," Journal of the American Statistical Association, 84, 717 - 726.

Kleffe, J., and Rao, J.N.K. (1992), "Estimation of Mean Square Error of Empirical
Best Unbiased under a Random Error Variance," Journal of Multivariate
Analysis, 43, 1 - 15.

Lahiri, P., and Rao, J.N.K. (1995), "Robust Estimation of Mean Squared Error
of Small Area Estimators," Journal of American Statistical Association,
90, 758 - 766.

Lahiri, P., and Tiwari, R.C. (1991), "Nonparametric Bayes and Empirical Bayes
Estimation of Variances from Stratified Samples," Sankhyā: The Indian
Journal of Statistics, B, 52, 105 - 118.

Lahiri, R, and Wang, W. (1992), Estimation of all Employee Links for Small 
Domains - an Application of Empirical Bayes Procedure." Proceedings of 
the Workshop on Statistical Issues in Public Policy Analysis, Eds.. J.N.K. 
Rao, and N.G.N. Prasad, II - 32 - II- 53. 

Laird, N.M., and Louis, T.A. (1987), "Empirical Bayes Confidence Intervals Based 
on Bootstrap Samples," Journal of the American Statistical Association, 
82, 739-750. 

McCullagh, P. and Zidek, J. (1987), "Regression Methods and Performance 
Criteria for Small Area Population Estimation," in Eds. Platek, R., Rao, 
J.N.K., Sarndal, C.E., and Singh, M.P., Wiley, New York, pp. 62-74. 

Morris, C.N. (1983a), "Parametric Empirical Bayes Inference: Theory and 
Applications," Journal of the American Statistical Association, 78, 47-59. 

Morris, C.N. (1983b), "Parametric Empirical Bayes Confidence Intervals," in 
Scientific Inference, Data Analysis, and Robustness, New York: Academic 
Press, pp. 25-50. 

Morrison, P. (1971), Demographic Information for Cities: a Manual for Estimating 
and Projecting Local Population Characteristics. RAND report 
R-618-HUD. 

Nandram, B., and Sedransk, J. (1993), "Empirical Bayes Estimation for the Finite 
Population Mean on the Current Occasion," Journal of the American 
Statistical Association, 88, 994-1000. 

National Research Council (1980), Panel on Small Area Estimates of Population 
and Income. Estimating Population and Income of Small Areas, National 
Academy Press, Washington, DC. 

Prasad, N.G.N., and Rao, J.N.K. (1990), "The Estimation of Mean Squared Errors 
of Small Area Estimators," Journal of the American Statistical Association, 
85, 163-171. 

Purcell, N.J. and Kish, L. (1979), "Estimation for Small Domain," Biometrics, 
35, 365-384. 

Rao, J.N.K. (1986), "Synthetic Estimators, SPREE and Best Model Based 
Predictors," in Proceedings of the Conference on Survey Research Methods 
in Agriculture, 1-16, U.S. Dept. Agriculture, Washington, DC. 

Robbins, H. (1955), "An Empirical Bayes Approach to Statistics," Proceedings of 
the 3rd Berkeley Symposium on Mathematical Statistics and Probability 
(Vol. 1), Berkeley: University of California Press, 157-163. 

Royall, R. (1970), "On Finite Population Sampling Theory under Certain 
Regression Models," Biometrika, 57, 377-387. 

Schaible, W.L. (1992), "Use of Small Area Statistics in U.S. Federal Programs," 
in Small Area Statistics and Survey Designs (G. Kalton, J. Kordos and 
R. Platek, eds.) 1, 95-114. Central Statistical Office, Warsaw. 

Singh, A.C., Stukel, D.M., and Pfeffermann, D. (1997), "Bayesian Versus 
Frequentist Measures of Error in Small Area Estimation," tentatively accepted 
in Journal of the Royal Statistical Society, B. 

Skinner, C. (1993), "The Use of Synthetic Estimation Techniques to Produce Small 
Area Estimates," Tech. Report, Dept. of Social Statistics, University of 
Southampton. 

Statistics Canada (1987), Population Estimation Methods, Canada, Catalogue 
91-528E. Statistics Canada, Ottawa. 

Tierney, L., Kass, R.E., and Kadane, J.B. (1989), "Fully Exponential Laplace 
Approximations to Expectations and Variances of Nonpositive Functions," 
Journal of the American Statistical Association, 84, 710-716. 

Zidek, J.V. (1982), "A Review of Methods for Estimating the Populations of Local 
Areas," Technical Report 82-4, Univ. British Columbia, Vancouver. 







